FROM THE EDITORS


Flat will, ultimately, fall flat

The latest mobile application-development design trend favors a “flat” UI. Developers are being told that skeuomorphism is out and flat is in.

The Merriam-Webster definition of skeuomorph is an ornament or design representing a utensil or implement. In other words, it’s a digital object made to resemble a real-life object. Skeuomorphs are used to make the new digital versions look like the old, familiar ones. Mobile applications have always used skeuomorphs to try to make the digital experience more closely resemble that of real life.

In Windows Phone 8, Microsoft moved away from skeuomorphism in an attempt to make mobile apps appear “authentically digital.” And in its upcoming iOS 7 release this fall, Apple is shifting its UI design from skeuomorphic to flat.

Following the lead of companies such as Microsoft and Apple, more and more mobile app development tool providers are also moving away from skeuomorphism in favor of flat. Some tool providers say it will improve app performance and save on battery life, the rationale being that flat mobile apps will take less time to perform actions and load in all the necessary visual effects.

One of the most popular arguments in favor of skeuomorphic design, though, is that it makes it easier for users familiar with the original device (a rotary phone, or a Rolodex) to use the digital emulation of it. Since their earliest days, mobile apps have been created to look as “real” as possible, incorporating elements that closely resembled real-life objects. Heck, that’s why mobile applications were cool in the first place.

We think this is where mobile applications got their fanbases and why users will just not like the new flat UIs. No matter how old or young you are, as a user, you want a great user experience when using your cellphone or tablet. We think skeuomorphism is not outdated; it’s still an important UI design trend. In mobile, after all, the user experience is key. A mobile application’s UI has a direct effect on user experience.

You can’t go backwards in design, removing all the pleasing visual aesthetics in mobile apps to which users are accustomed. Flat will not be visually appealing. Flat will look boring in mobile apps. Flat is a step back. The current “flat” craze will, ultimately, fall flat.


Apple’s security woes teach us all

Apple’s developer site was not hacked by some international conspiracy, nor was it broken by a malicious former employee. It was taken down by a bug in Apache Struts that allowed remote code execution on its servers.

The thing to note here is that the exploit in Struts was made public on June 24. Apple reportedly was informed of the problem back on May 10. The actual patch for this exploit was released on July 16, and Apple’s site was hacked shortly thereafter.

For the rest of the world, this means that you need to be up to date on all of the exploits in your stack. Proper coding procedures don’t mean a thing if a remotely exploitable bug is lying in wait inside your stack. If you find out that something in your stack is vulnerable, you really need to drop everything and patch it ASAP. In a world with more than a billion people online, there is now zero time between when an exploit hits BugTraq and when it is in use in the wild. Plan accordingly.


WebRTC: Still about the codec

In speaking with the World Wide Web Consortium about HTML5 and WebRTC, the issue of the video codec keeps coming up. Should the group accept VP8, the Google-backed codec, or H.264, which Microsoft, Apple and many others already have implemented, but which comes with a licensing fee?

As of today, WebRTC specifies only one codec: VP8. A big part of the reason, WebRTC specification editor Dan Burnett told SD Times in this issue’s cover story, is ubiquity. Only that which can be implemented inexpensively will become nearly universally used.

So the issue for the W3C becomes this: What is the least amount it will require of the codecs to guarantee interoperability? Currently, there is no way to do a conference call between browsers using different codecs.

Yes, there are transcoding gateways to convert VP8 media streams to H.264, but the benefit of direct, real-time transmission is lost, and there’s a cost associated with that conversion.

So the W3C finds itself in a bit of a spot. It doesn’t wish to alienate those using the H.264 codec, while it acknowledges through the WebRTC definition that the VP8 codec is the logical choice for ubiquity.

The W3C should plant its flag in the VP8 ground and say this is the codec the specification has chosen. If you want to have a standard WebRTC browser, this is how you’ve got to go. And while some might balk at first, we’re confident all will ultimately get on board.
SDTimes on the web

4. Wearable computing may be an important category in the future, but that means integration with fashion designers.

5. At US$1,500, “trust us, you'll like it” is a bit much (okay, more than a bit much). The numbers have to drop to iPod level. I want to be able to buy this on a whim. What is that price point? I'm not sure, but it seems to me that price is a major stumbling block.

We remember the mouse-mail

Last month, pre-eminent engineer and inventor Douglas Engelbart died at the age of 88. Most remember him for designing and patenting the computer mouse in 1970, but he also worked on bitmapped screens and hypertext. In this age of mobile computers, where hands are the primary input
Your call: Apple or Samsung?

Which smartphone do you prefer, the Apple 
iPhone 5 or Samsung Galaxy S4? 

Apple and Samsung are in a smartphone dogfight right now. 
Apple sold 31.2 million iPhones by the end of June, compared 
to 26 million the same time a year ago, beating Wall Street 
expectations by 5%. Despite the gains, 

Samsung's staggering numbers still 
crushed Apple's revenues. Samsung sold 
a record 71 million smartphones during 
the second quarter. But numbers aren't 
everything. Given the choice between the 
iPhone 5 and the Android-running Galaxy S4, 
which would you choose? 

Vote in our LinkedIn poll and tell us which way you lean. Whether you prefer the functionality, usability, interface or operating system of one over the other, we want to know.

SD Times wants to hear from you. Join us on LinkedIn and Facebook.


FEEDBACK 

A life-changing decision 

Regarding your article “Stand and Deliver” (July 
2013, p. 50), I’ve had a “Next Desk” standing desk 
and a LifeSpan walking treadmill for four months, 
and I love it. I walk an average of 20 miles a day at 2 
mph. It hurt at first, but since my legs got used to it, I am 
now more comfortable walking than 
standing or even sitting. I’ve lost 
weight and rarely stop when emer¬ 
gencies arise. It is definitely healthier 
for me, but I also find I can concentrate 
as well as if not better than when I used 
to sit for hours. Changed my life. 

David Leppek 
Tod Nielsen
talks Heroku 

Leaving VMware to join Salesforce, 
company’s new CEO is taking 
the temperature at his new place 


BY ALEX HANDY 

Tod Nielsen has been in the software 
industry since it was only first becoming 
an industry. In the 1980s and 1990s, he 
worked at Microsoft, cutting his teeth on 
projects that would eventually lead their 
respective markets. In the 2000s, Nielsen 
worked at BEA Systems and Oracle, and
eventually headed up Borland as CEO. 
He then took the COO job at VMware. 

But in June, Nielsen accepted the position of CEO of Heroku, a surprising choice given that Heroku is owned by Salesforce, and that VMware and Salesforce have not exactly been friendly with one another. He took over the job from Heroku’s former CEO, Byron Sebastian, who is currently on a sabbatical that began only a few months after the Salesforce acquisition. We spoke with Nielsen about his time in the industry, and his plans at Heroku.

SD Times: What are your priorities as CEO 
of Heroku? 

Nielsen: I have been fortunate enough to 
have been around the block a few times, 
and I have a lot of experience with devel¬ 
opers. But I also know the world has 
changed with the cloud taking off, so I 
am taking a lot of time with employees 
just learning and trying to understand 
whether my views and biases are correct. 

I’m meeting with every employee in the company, one on one. I’m trying to gain an understanding of what they do, why they came here, what they love, what we could do better. Once I get that data absorbed and collected, we’ll more clearly be able to put out a list of priorities. Really, for the first month, it’s to continue to learn and absorb what we have.

What's with the hate between Salesforce 
and VMware? They don't even compete 
with one another yet. That VMforce part¬ 
nership never even made it out of the gate. 

I think it’s more of an issue of down the 
road. Part of the issue back then was dif¬ 
ferent leadership. Marc Benioff has one 
style, Paul Maritz had a different style. 
They kind of clashed as both companies 
were trying to beat their chests over who 
owns the cloud, who is the pioneer in the 
cloud. They got messed up, and it all 
came to a head when Salesforce bought 
Heroku and took their own PaaS service. 
That led to the end of VMforce, and that 
caused some tensions as well. 

VMware was very much focused on private. Heroku doesn't even believe private exists or should be a thing. How do you rectify this switch in your direction now?

The good news is I am glad I have that experience at VMware. They haven’t talked to many customers that have a requirement or desire for an alternative PaaS outside of Amazon at Heroku. I can bring that perspective. One of the things I’ve learned is that the Salesforce folks have done a tremendous job working with customers. In customers’ minds, there is a private cloud they would have in their own data centers, there’s the Salesforce data centers, and then there’s the unwashed cloud at large. So Salesforce does a tremendous job of earning the trust of their customers that they can run a data center or cloud in a trusted way.

There are a lot more private PaaS offerings 
now from many different vendors, most 
embracing polyglot. How does Heroku feel 
about polyglot PaaS? 

In many ways, it’s a logical movement. If 
I look back 15 years, you were either a 
.NET developer or a Java developer. 
These days, there’s so much fragmenta¬ 
tion, any vendor betting on one language 
is going to limit their broad appeal. So I 
think that’s going to be the name of the 
game for at least the next five years; that 
you’ll see most of the platforms try to 
provide support for a variety of lan¬ 
guages and frameworks and tools. 

Are developers more important to busi¬ 
ness now than they were when you started 
out, or are they just more visible? 

It’s a little of both. Clearly, the digital 
economy is happening where IP is 
transforming all kinds of businesses. 
There are books like “The 
New Kingmakers.” From my 
perspective, developers have 
always been crucial. Back at 
Microsoft, when I was creating 
developers in the 1980s and 
1990s, we believed they were 
the keys to driving Windows. I
don’t know that their importance has
changed as much as their visibility.

What's changed the most since you last headed up a developer-focused company?

The interesting thing is what happened back in the 1980s. You had a large set of line-of-business people that could do a lot of development themselves. With products like dBase and Paradox and Access, there were a lot of people starting off as a wedding photographer, and they’d built a database and ended up in the software business. “I’m no longer a wedding photographer, I’m selling wedding-photographer applications.”

These days, the sophistication of developers has risen to where I don’t see that as much. It’s rare to find the line-of-business person who has no real formal training in development say, “I can build software and run my own stuff.”

I think with mobile, the back end is changing and requires 
a certain level of complexity, and that’s what Heroku has done 
well: simplifying that so you can focus on your application 
rather than target platforms and all those issues. 

The biggest change is the cloud: what it takes to build an application of significant scale now is completely different. Think about the early 1990s, when Novell was at COMDEX in 1992. They were excited to show the world’s largest LAN application with 1,000 users. These days, 16 guys can start a company called Instagram and get 30 million users in 18 months. You could never imagine that level of scale and adoption back then.


Fondest memory up to this point? 

In my mind, I grew up at Microsoft. That’s where I cut my 
teeth, and I loved the early days of Microsoft. In the 1980s 
and up to the mid-1990s, it was just an incredible time to 
see a problem, be innovative, and drive and define an indus¬ 
try. I look back at that, as a lot of what I’ve been able to do 
and become I attribute back to those early days. I will fre¬ 
quently find if I am presented with a problem, I will think, 
“If we were faced with a similar problem back then, how 
would we have solved it?” 

We had DOS as a leader back then, but we were No. 2, 3 
or 4 in every other category. What it took to focus on cus¬ 
tomers and to win was very satisfying, and probably my fond¬ 
est memory. 

Is Byron Sebastian ever coming back? 

He’s enjoying tending his olive trees in Healdsburg. I know Byron really well. I hired him to work at the startup I was running when I left Microsoft. I’ve known him for 13 years. Before I took this job, I spent the day with him at his olive tree place to see what’s for real here. I can say he’s at peace, he’s happy, he’s enjoying it. He was just kind of burnt out on the whole tech sector and is cooling off.

When Byron left, the day after he left he called me and said, “You have to take this job, it’s the best job in the industry, you have to take it.” It took me nine months to come to my senses. It’s a fit for who I am, and if I can help them out here, that’ll be great.
Prezi's Evan Czaplicki, right, designed Elm to be secure, being statically typed while compiling down to JavaScript.


Elm grows as an alternative to JavaScript 

Functional language project brings type safety to Web app development 


BY DAVID RUBINSTEIN 

Evan Czaplicki decided about a year and a half ago that his senior thesis at Harvard would be about creating a functional programming language with secure theoretical underpinnings, one that is statically typed yet compiles down to JavaScript for use with Web applications.

Today, Czaplicki works at a company called Prezi, which, as the name implies, is creating presentation software it hopes will become an alternative to Microsoft’s PowerPoint, and the language is called Elm. It is an open-source project hosted on GitHub and licensed via BSD3. Elm, which he said he wants to see become both an alternative and a complement to JavaScript, can be found at elm-lang.org.

“The project actually got started at Google,” said Czaplicki, who spent some time there. The company wanted a solution that kept the flexibility of JavaScript while eliminating the weaknesses in the language associated with the building of large applications. “Elm introduced type safety, but you keep the flexibility,” he said. “You can grow code that’s not frail and you can add features to it.”

But Czaplicki said he wanted some¬ 
thing that was visible across platforms, 
and JavaScript is best for that. “Plus, the 
browser does some work too,” he noted, 
adding that he did not have to write his 
own garbage collector, for instance. 

Static is fine here 

The fact that Elm is statically typed 
“gives you a guarantee that you won’t 
get runtime errors,” said Czaplicki. “If 
there’s a bug, it’s because you didn’t 
understand how to write that piece of 
code. It’s not a typo.” In JavaScript, you 
get errors, such as trying to access a 
field that no longer exists or never exist¬ 
ed, he explained. “JavaScript tries, and 
then moves on. When the bug finally 
manifests itself, you’re far from where it 
is. With type safety, the bug is apparent 
at compile time,” he said. 

Using a statically typed functional language makes changing and refactoring code easier, Czaplicki added. It defines things such as, “This object has a certain shape and cannot be mixed with an object of a different shape.” He said this is a core principle in Elm.

It is also why the project is such a 
good fit for Prezi. “The approach Elm 
uses is well-suited to presentations. 
They’re most declarative. ‘Here’s what 
my frame looks like, and here’s how we 
step through it.’ It’s a natural fit,” he said. 

The elm-lang.org website currently 
has an interactive compiler through 
which people can contribute to the 
project. “Right now, it’s great for exper¬ 
imenting,” Czaplicki said. “In two or 
three years, it’ll be more ready for a 
professional project.” 

One area that has to catch up is tool¬ 
ing. “There’s been tooling for impera¬ 
tive languages for 20 or 30 years,” he 
said. “For functional languages, it’s just 
emerging.” 

Czaplicki said people want an online editor, and the project has added the ability to do live coding and hot-swapping, so you can see the change immediately on the other side. “People want type information on all functions as they show up in the editor. It’s getting there,” he said.
Java branches out
to new platforms 

Oracle ties expansion to integration program 


BY ALEX HANDY 

Oracle in July released version 3.3 of the Java ME platform, designed to spread the usefulness of the Java language into the embedded systems market. It was just one of a series of updates across Oracle’s product lines, as the company began rolling out the 12c versions of its database, frameworks, tools and platforms.

Java ME version 3.3 is primarily focused on adding support for new platforms. This release actually includes a port of Java ME that runs on the Raspberry Pi, but the real meat of the cross-platform support comes from the Oracle Java Platform Integrator program.

Peter Utzschneider, vice president 
of product management at Oracle, said 
that “the Oracle Java Platform Integra¬ 
tor program enables partners to either 
do more ports of Java ME embedded 
themselves, or to do value-add on top of 
Java Embedded. They could build a 
function API for healthcare and ship a 
custom binary for that industry.” 

Utzschneider said that the Oracle Java Platform Integrator program is tied to an “encoding layer that makes it easier for people to pick up a Java ME reference implementation and port it to a different chipset. The program is a way to expand the aperture of people and partners who have access to the market.”

Maven support comes with JDeveloper 12c.

Oracle’s push into 12c will culminate at its Oracle OpenWorld conference in September in San Francisco. Until then, developers can test-drive 12c-focused versions of JDeveloper, the Oracle Application Developer Framework (ADF), and the Oracle Enterprise Pack for Eclipse. NetBeans was also updated.

For developers, the 12c updates 
mean that all three IDEs offered by Ora¬ 
cle support the 12c Oracle environment. 

Bill Pataky, vice president of product 
management for tools and frameworks 
at Oracle, said, “We’ve refreshed our 
tool offering to reflect the updating 
infrastructure Oracle Fusion Middle¬ 
ware brings to the table.” 

What's in it for cloud and mobile? 

Shay Shmeltzer, senior group manager 
at Oracle, said that the 12c editions of 
JDeveloper, NetBeans, the Oracle 
Enterprise Pack for Eclipse, and ADF 
all include benefits for both mobile and 
cloud developers. 

ADF has gained new components for 
generating applications. Automated code 
is included to generate timelines 
and reports, and to cut applications 
down to iOS’ screen size. 

Said Shmeltzer, “One very 
important area for us is the ADF 
Faces, which is our solution for 
building rich Web interfaces, and 
for getting those to mobile devices 
and tablets. It includes location 
support, and we can run on iOS and 
Android, using HTML5 render¬ 
ings, and thanks to improvement in 
the screening capabilities.” 

JDeveloper received the most holistic updates, as it gained core support for Git and better support for Maven. This Maven support is extended to WebLogic, thus allowing for deployments to conform more easily to the version and dependency management imposed by a Maven workflow. JDeveloper also benefited from the collaborations inside Oracle’s IDE team, as it now includes memory-monitoring code taken directly from NetBeans.

Oracle Application Development Framework version 12c includes a lot of new components and presentation layouts. The highlights are:

Mobile/tablet support: Better support for touch-based user interfaces on tablets and adaptive layouts

New, improved components: New data visualization components that display information in rich, meaningful ways using timelines, treemaps, list view, and sunburst visualization; easier customization with a new skin and visual skin editor

REST support and other data-control improvements: A new REST data control simplifies integration of REST-based business services, and a new EJB/JPA data control provides extended functionality for EJB/JPA services integration into Oracle ADF applications

This release also marks the beginning of a coming-together between Oracle’s Eclipse tools and its two full IDEs: JDeveloper and NetBeans. Said Pataky, “Since the Sun acquisition, the NetBeans team has reported to the same group here: the Java development tools organization within Oracle. I run product management for all three tools. Each have their own role within the ecosystem.”

And while each has its own role, they will all begin sharing more code, he said. “There are other features, like the Java C compiler, that are now shared because NetBeans is such a high-quality tool, it’s easy for us to leverage it in one IDE and bring it to the other. They [the NetBeans developers] are out there pushing the limits on the high-performance aspects of the language, and we’re able to leverage that in JDeveloper itself. We are going to leverage things that have to change in both tools.”
What does mobile APM even mean?

Users don't expect (or want) simply more speed, experts say 


BY SUZANNE KATTAU 

Confusion surrounds mobile application- 
performance management today, in part 
because the term is defined differently 
based on whom you ask. But a successful 
mobile APM strategy should focus on 
user experience, and include monitoring 
and managing mobile applica¬ 
tions across their entire life cycle, 
according to industry experts. 

Panos Papadopoulos, CEO of app performance-monitoring tool provider BugSense, said that one of the challenges inherent in mobile APM is that there is no specific definition of mobile APM on which the industry agrees. “Performance can mean downloads and ratings for some people; for other people, it’s network speed,” he said.

“For us, mobile APM is more about 
the stability of the software. The most 
important thing for us in mobile is spe¬ 
cific transactions, how long 
transactions take.” 

Papadopoulos added that it is 
also important to measure the 
quality of applications because 
too many errors can negatively 
affect the user experience. 
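The two measurements Papadopoulos singles out, how long specific transactions take and how many of them fail, can be made concrete with a small sketch. The Python below is illustrative only and makes several assumptions: `TransactionRecorder`, its methods and the `"login"` transaction name are hypothetical and do not represent BugSense's SDK or any vendor's API.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class TransactionRecorder:
    """Hypothetical in-process recorder; real APM SDKs work differently."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock                  # injectable so tests can fake time
        self.durations = defaultdict(list)   # transaction name -> seconds per run
        self.errors = defaultdict(int)       # transaction name -> failed runs

    @contextmanager
    def transaction(self, name):
        start = self._clock()
        try:
            yield
        except Exception:
            self.errors[name] += 1           # count the failure, then re-raise
            raise
        finally:
            # Record the duration whether the transaction succeeded or failed.
            self.durations[name].append(self._clock() - start)

    def summary(self, name):
        runs = self.durations[name]
        return {
            "count": len(runs),
            "errors": self.errors[name],
            "slowest": max(runs) if runs else 0.0,
            "mean": sum(runs) / len(runs) if runs else 0.0,
        }


recorder = TransactionRecorder()
with recorder.transaction("login"):
    sum(range(1000))                         # stand-in for real work
print(recorder.summary("login")["count"])    # 1
```

Wrapping each user-visible action this way yields per-transaction timings and error counts, which is the kind of data the executives quoted here describe; shipping those numbers off the device and aggregating them is outside the scope of the sketch.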

Mobile APM is not just about app responsiveness or speed anymore, according to Kumar Rangarajan, CEO of mobile performance-analysis tool provider Little Eye Labs. He agreed that how the app impacts the user experience is paramount, adding that app performance should always match what your users expect.

“Performance is something that’s 
universal across all apps,” he said. 
“Speed is still very important, but 
mobile APM has to go beyond just 
monitoring speed.” 

Rangarajan said a mobile APM solution should help developers track their app’s data, memory, CPU and battery consumption. A big challenge in mobile APM is managing how apps affect battery life, he said. “It can be frustrating for users to realize that their app is draining their battery,” he explained.

Today, the mobile APM market is shifting as enterprises move their business-critical functions onto mobile devices, according to Andrew Levy, CEO of mobile APM solution provider Crittercism. “As an organization, you can begin to think about how you are going to deploy applications across your organization,” he said. “How are you going to use these to move your business-critical functions onto smartphones and tablets?”

The definition of mobile APM depends on who's defining it, says BugSense's Papadopoulos.

According to Crittercism's Levy, app monitoring must extend into app improvement.

Little Eye Labs' Rangarajan reminds users that app energy management matters too.

Levy said this is, by far, the biggest trend that he has been seeing, and it’s happening across verticals. He said companies are also facing operational complexities as they change their point-of-service devices to smartphones and tablets. “We’re working with a lot of field operations, tracking truckers or parts, and inventory management, sales, operations or insurance adjusters,” he said.

“A lot of mobile apps are thin clients that may connect to, say, 20 disparate data sources or APIs. Any one of those can have a performance impact on the application.”

Measuring success 

To date, the top two methods companies 
rely on to evaluate mobile application 
performance are user ratings and social- 
trend analysis, according to Levy. How¬ 
ever, the problem with these methods, 
he said, is that companies must be 
proactively finding application-perfor¬ 
mance problems before negative feed¬ 
back is made. 

“The industry as a whole needs to think of mobile APM as a proactive solution utilized throughout the entire life cycle of an app, not a reactive solution,” Levy explained. “Additionally, companies have relied upon lab-based testing approaches, which do not account for the many complexities and issues that occur in real-world environments.”

With mobile apps becoming increas¬ 
ingly vital to a business’ overall perform¬ 
ance, Levy said that it is important to 
manage and improve—not just meas¬ 
ure—application performance. “Thus 
the focus and purpose of mobile APM 
centers on helping companies detect, 
prioritize, isolate, diagnose, repair and 
prevent problems before users or a 
business are impacted,” he said. 

To have a successful mobile APM 
strategy, Levy said organizations need to 
monitor and manage mobile applications 
across their life cycle and at all stages of 
delivery. “Issues such as latency at the 
endpoints, the amount of data trans¬ 
ferred, and the bandwidth available can 
cause an app to crash or cause perform¬ 
ance issues,” he said. 

Going forward, Levy said that effective mobile APM should include continuous monitoring and management of network services, as well as application availability and response time to ensure the best user experience.





SD Times | August 2013 | www.sdtimes.com

iOS 7: The developer reaction 

Tools and support get scrutiny from community 


BY CHRIS BARYLICK 

Apple’s iOS is about to go through some major changes. The operating system, which has led to the sales of millions of iOS-based devices, is undergoing a dramatic overhaul for version 7. As part of this overhaul, developers at Apple have given the user interface a simpler, “flatter” look. They’ve also incorporated changes into features such as Control Center, overall multitasking, Notification Center, Safari, camera and App Store elements, and they even claim improved efficiency for the battery.

With the first iOS 7 SDKs out the 
door (the second and most recent one 
having been released on June 24) and an 
anticipated release date for the fall, 
members of the developer community 
are weighing in on the tool set that Apple 
has released, as well as its support of it. 

“One of the things that made the biggest difference is, honestly, the tool set,” said Jason Titus, CTO of Shazam. “They’ve made Xcode and Instruments and the ability to debug your apps and understand what’s going on under the hood. It made that much, much easier, and that makes a big difference for all developers and definitely for us.”

Titus went on to say that Shazam was 
looking to incorporate additional 
music-purchasing and push-notification 
features into future versions of its iOS 
app, something the iOS 7 SDK allows it 
to do (albeit without the level of speci¬ 
ficity to the notification system as his 
company would like). 

“It’s a little bit of a challenge that all 
that is hooked up to overall user notifica¬ 
tions,” said Titus. “So if a user says they 
don’t want to have in-your-face push 
notifications, then they currently won’t 
be able to have behind-the-scenes silent 
updates either, which isn’t ideal.” 


Even with these caveats, Titus stated 
that his developers were able to get a 
working build of its updated app up on 
literally the same day that the SDK was 
released. And his team is looking for¬ 
ward to improved multitasking in the 
upcoming operating system. 

More developer feedback 

In other cases, developers got what 
they needed right off the bat with the 
SDKs as well as some of the items on 
their wish lists. 

“I just wanted things to be faster,” said 
Bruce Morrison, CEO of Man Up Time. 
“I’ve yet to run into an issue where iOS 
can’t do what I need it to do. So for iOS 
7, my biggest hope was just increased 
performance and speed.” 

Morrison went on to admit that his apps didn’t require any of iOS 7’s key features and improvements, but he was impressed with what there was to work with and that Apple’s been responsive to his needs as a developer.

One common sentiment expressed by developers was happiness over Apple’s developer support, which seemed able to address issues as quickly as they surfaced.

“So far it’s been great, although that’s speaking from the standpoint of a developer that was lucky enough to get to WWDC, so I could access that support first-hand,” said Aaron Fothergill, CEO and lead coder at Strange Flavour. “They work on bug lists. If it’s something you need fixing or think could be improved, file a bug report. That’s the best way to get it in front of someone who can sort it.”

Fothergill said that the worst issue he’d seen with the iOS 7 SDK to date was in connecting to a motel’s Wi-Fi system when he traveled for a business conference. “But that’s normal for an early beta,” he joked. “I also don’t like


Eight features of iOS 7 that every developer needs to know

1. Auto-layout: With iOS 7, Apple has made it easier for developers to incorporate layout technology to make sure layouts appear correctly on a user's screen, regardless of screen size or orientation.

2. UIDynamics: iOS 7 includes physical attributes that can be assigned to views and pages on the app screen, including gravity, collision modes, snap, push and behaviors.

3. TextKit: iOS 7's new typography framework has the potential to dramatically impact the design of custom apps. There is a greater emphasis on typography over images. Developers can adjust the size and weight of fonts across the app, as well as enable text-wrap around images.

4. Multi-tasking: With multi-tasking enabled, developers can do work on the back end without impacting the user's in-app experience. They can fetch information for data-driven apps, and update a user's data before they even open the app.

5. Control Center: Users can quickly get to frequently used controls, such as toggling various antennas.


6. AirDrop Sharing for apps: Enterprise 
app developers will be able to integrate 
real-time sharing of documents and con¬ 
tent, providing an efficient platform for 
sales, presentation and collaboration tools. 

7. iBeacons: Now available are low-cost 
transmitters that will be able to work with 
an enterprise's iOS device to collect loca¬ 
tion data—like location inside of a build¬ 
ing-even if a GPS system is not installed. 

8. Device and data security: iOS 7 
comes with features to enhance the user 
experience and increase enterprise secu¬ 
rity, including enterprise single sign-on, 
activation lock and per-app VPN. 

Source: J Schwan, Solstice Mobile 

the space bar position when entering 
search terms in Safari. I keep tapping 
the Tco.uk’ button that’s been added 
where the right side of the space bar 
used to be. 

“Remember, it’s a work in progress 
and you’re testing it,” said Fothergill, 
advising new developers and hobbyists 
looking to break into iOS 7 coding. “This 
also means stuff will break, and it’s better 
to contact the dev (or Apple) through 
their support link on the App Store.” I 
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A head start on code quality assessment 

New cloud platform offers model based on ISO characteristics 


BY SUZANNE KATTAU 

Quality-assurance solution provider 
Optimyth Software recently announced 
Kiuwan, a new cloud platform that lets 
developers measure and analyze the 
quality of their application’s code in the 
early stages of the software develop¬ 
ment life cycle, before they test their 
applications. 

The Kiuwan platform uses static 
code analysis to analyze the risk and 
compliance of applications built in the 
ABAP IV, C#, COBOL, Java (including 
some Android support), JSP, PL/SQL, 
SQL, VB.NET and VB6 programming 
languages. The company said support
for C/C++, JavaScript, Objective-C,
PHP and extended support for Android
in Java will be added in September.

Kiuwan also helps developers gather
information on the kind of defects that
can be detected with static code analysis.
“For example, we don’t run the
application to see what happens when
you click a button,” explained Javier
Salado, marketing and business development
director at Optimyth. “That
[kind of defect] is covered by unit testing
and functional testing.”

Salado said that the defects Kiuwan 
can detect are ones that could break an 
application, and added that Kiuwan can 
find them even before the application is 
built. “For example, having if(a=1)
instead of if(a==1) can break your
application in many different ways, but 
can be difficult to detect with function¬ 
al testing,” he said. “We can detect this 


defect and other code pat¬ 
terns that can be harmful. 

We can detect compliance 
defects and non-functional 
defects that affect the effi¬ 
ciency or reliability of your 
application before you run 
performance tests.” 
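The kind of rule Salado describes can be sketched in a few lines. The following is only an illustration of the general idea of a static check, not Kiuwan's implementation: the function name and the regular expressions are assumptions made for this example, and the check only handles simple one-line conditions without nested parentheses.

```python
import re

# Hypothetical rule: flag an `if (...)` whose condition contains a lone `=`
# (an assignment) rather than `==`, `!=`, `<=` or `>=`.
IF_CONDITION = re.compile(r"\bif\s*\(([^()]*)\)")
LONE_ASSIGNMENT = re.compile(r"(?<![=!<>])=(?!=)")


def find_suspect_conditions(source: str) -> list:
    """Return (line number, line text) pairs for suspicious conditions."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = IF_CONDITION.search(line)
        if match and LONE_ASSIGNMENT.search(match.group(1)):
            findings.append((lineno, line.strip()))
    return findings


sample = "if(a=1) { run(); }\nif(a==1) { run(); }\nif(a<=1) { b = 2; }"
print(find_suspect_conditions(sample))  # [(1, 'if(a=1) { run(); }')]
```

The source is never executed; the check reads the text alone, which is why this class of defect can be reported before the application is built.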

Kiuwan gathers infor¬ 
mation on defects based on 
rules, Salado said. “You can 
modify the rules if you 
want, but we give you a 
standard set of rules, a stan¬ 
dard set of metrics to run,” 
he said. “All these rules are compliant 
to the ISO 9126, which is a standard 
that basically recommends characteris¬ 
tics of software quality.” 

If a developer’s application meets the
five characteristics in what is known as
Kiuwan’s Global Quality Indicator, then,
Salado said, developers will remain
compliant to ISO and can be assured of
their application’s quality. “The five
characteristics we measure are maintainability,
reliability, efficiency, portability
and security,” he explained. “ISO
actually gives you more, but those are
the five that we have implemented into
our quality model.”

Salado added that its Quality Indica¬ 
tor is based on the ISO 9126, but that it 
does not cover usability because usabil¬ 
ity cannot be measured with static code 
analysis. 

For developers concerned about 
uploading their code outside their fire¬ 


walls for analysis, Salado 
said they can download 
the Kiuwan analyzers to 
run the static analysis 
locally. “This is software 
that you can easily install,” 
he said. “The software 
uploads only the results to 
the cloud, so you can still 
track the quality of your 
software from the cloud 
platform while your code 
stays secure.” 

Salado said another 
benefit of being able to 
run the analysis locally is that developers 
can integrate it with their continuous- 
deployment process. 

Developers can start using Kiuwan 
for free, and can continuously analyze 
up to three applications if they are 
smaller than 25,000 lines of code. “By
‘continuously analyze,’ we mean, as
often as you need,” Salado explained.
“The code-analysis should be part of 
the continuous deployment/integration 
process, so you can analyze on every 
check-in, every build, every deploy— 
whatever makes sense to you. Kiuwan 
does not have any restrictions or charge 
extra in the number of analyses.” 

Along with the Kiuwan platform, 
Optimyth also recently announced the 
free Kiuwan Early Adopters program. 
The first 50 companies to join will get 
unlimited access to the platform for a 
full year. The offer ends Aug. 31, Salado
said.



Optimyth's Javier Salado 
says static code analysis can 
detect app-breaking bugs.

Privacy guidelines get an update

App Quality Alliance urges using practices from the start 


BY SUZANNE KATTAU 

The App Quality Alliance (AQuA) has 
added privacy-related recommenda¬ 
tions to its mobile application devel¬ 
opment Best Practice Guidelines in 
light of increasing worldwide con¬ 
sumer privacy regulations. The new 
recommendations in Version 2.3 of 
AQuA’s Best Practices Guidelines are 
designed to help mobile developers
address topics such as users’ rights,
location data, and information security
and accountability.

AQuA, a nonprofit mobile industry 
trade association, is comprised of eight 
member organizations: AT&T, LG, 
Motorola, Nokia, Oracle, Orange, Sam¬ 
sung and Sony Mobile. In March 2011, 
the first version of AQuA’s Best Practice 
Guidelines was released. 

However, using the latest version of 
the guidelines, developers can now 
also navigate privacy requirements 
during their application development 
and QA processes. “The Best Prac¬ 
tices are what you can use when 
you’re designing your application, and 
trying to work out how you should 
approach some of these aspects so you 
can avoid any errors in the early 
design stage,” said AQuA chairman 
Martin Wrigley. 

“Rather than just relying on testing
at the final stage, it’s far more efficient
in the development process to get it
right from the very beginning.”

AQuA's Martin Wrigley says new guidelines
focus on earlier stages of development.

GSMA's influence 

AQuA incorporated consumer privacy-focused
recommendations into its Best
Practice Guidelines by working directly
with the GSM Association (GSMA),
a mobile industry organization that
comprises more than 400 mobile carriers.
Wrigley said AQuA looked to
GSMA’s Mobile Privacy Initiative,
whose core objective is to help establish
universal mobile development
guidelines and approaches that address
consumer concerns.

Five new topics in the AQuA
Best Practice Guidelines

1. Requirements for encryption
of data

2. Recommended use of passwords

3. Guidance on privacy policies,
user rights, and correct use
of data collected

4. Importance of consent for use of
location data

5. Information security and
accountability

In 2012, the GSMA Mobile Privacy 
Initiative published the GSMA Priva¬ 
cy Design Guidelines for Mobile 
Application Development, which 
details 29 specific guidelines to help 
mobile developers in the area of con¬ 
sumer privacy. Wrigley said some of 
these were incorporated into AQuA’s 
guidelines. 

“We’ve incorporated the guidelines 
from the GSMA for mobile privacy 
because we feel that it’s important to 
have all this crucial information 
brought together in one place,” he said. 
“What we’ve done is show that, by 
working with other organizations, we 
can bring together a single set of best 
practices.” 


The inclusion of the GSMA mobile 
privacy recommendations within 
AQuA’s Best Practice Guidelines “fur¬ 
ther reinforces the importance of con¬ 
sumer privacy as part of the app-design 
process,” according to Pat Walshe, 
director of privacy for public policy at 
the GSMA. “Privacy-by-design is key to 
winning and keeping the trust of app 
users,” he added. 

Wrigley said it was at GSMA’s 
Mobile World Congress (MWC) con¬ 
ference in 2012 where AQuA first con¬ 
sidered including recommendations 
regarding consumer privacy in their 
application quality-focused Best Prac¬ 
tice Guidelines. “We were doing some 
sessions with developers and, as you 
know, privacy is a hot topic at the 
moment,” he explained. 

“Discussion came up on privacy, and 
we were asked if our Best Practices cov¬ 
er privacy issues; we were being asked 
what developers should do.” 

Wrigley said that when MWC 
attendees asked him what developers 
could do about privacy, his first 
impression was that it was a difficult 
topic because every country or state 
has its own jurisdiction. “But it comes 
down to...a number of good guiding 
principles that you can use, which 
actually then automatically satisfies 
the legislation in just about every sin¬ 
gle area, no matter whether if it’s Cal¬ 
ifornian legislation or EU legislation,” 
he explained. 

“The same underlying principles of 
transparency and control, paying atten¬ 
tion to what you’re doing with the data 
and how you hold it, making sure that 
your users are educated in what you’re 
doing with their data, these sorts of 
principles are universally applicable. 

“And that is what’s reflected in the 
Guidelines, giving developers a solid 
base from which to design their appli¬ 
cation, which should put them in a very 
good position to actually satisfy any 
legal requirements that they are hit
with.”

Google unveils Chromecast

Plug-in allows syncing of mobile devices and TV sets 


BY ROB MARVIN 

July saw the debut of a new USB-sized
device that allows users to stream
videos and browser tabs directly to
their TV from any device: the Chromecast,
available for US$35.

Google’s device turns every TV into
a smart TV, using any Android or
iOS phone or tablet, or a laptop, as a
remote control. After syncing the
devices through local Wi-Fi, users sim¬ 
ply click the “cast” button to play 
videos, tabs or music on their screen. 


Google demonstrated by playing 
YouTube and Netflix videos and audio 
from Pandora. 

Chromecast is immediately avail¬ 
able for purchase on Google Play, 
Amazon and Best Buy. The $35 pur¬ 
chase also comes with a three-month 
Netflix trial. Even more importantly, 
Google will soon release Google Cast 
SDK to allow developers to tailor 
streaming for other apps. 

Meanwhile, Google also released 
the Nexus 7, boasting enhanced ver¬ 


sions of Google Drive and Google 
Maps, but more importantly a Google 
Play Textbooks feature. Marketed at 
college students, the app allows easy 
purchase and storage of e-textbooks, 
an alternative to carrying a pile of 
them around. 

On top of that is the debut of 
Google Play Games, incorporating 
Google+ circles into your gaming 
friends list, and demonstrating incred¬ 
ibly detailed graphics in games like 
Prince of Persia and Asphalt 8.


BigLever draws 'family trees' in Gears

Software lets businesses group feature sets for multiple product versions 


BY DAVID RUBINSTEIN 

Organizations often struggle to man¬ 
age variations in their product line 
(such as auto manufacturers and com¬ 
panies that sell software in multiple 
versions) as decisions spread beyond 
engineering into the business side. So 
BigLever Software, one of the pioneers
of product-line-management
software, has released Gears 7.2 with
the ability to group projects by family
tree.

CEO Charles Krueger explained: 
“General Motors does product-line 
engineering on a mega-scale. Last 
year it produced 9 million vehicles, 
which translates to a car coming off a 
production line every three seconds, 
24x7.” 
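Krueger's round figure is easy to check. The short calculation below is only a sanity check of the quoted numbers; it assumes a 365-day year of continuous production, and the function name is made up for the example.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes round-the-clock production


def seconds_per_vehicle(vehicles_per_year: int) -> float:
    """Average interval between vehicles leaving the line."""
    return SECONDS_PER_YEAR / vehicles_per_year


print(round(seconds_per_vehicle(9_000_000), 2))  # 3.5
```

Nine million vehicles a year works out to one roughly every three and a half seconds, which is consistent with the "every three seconds" figure as a rounded estimate.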

To do this, he said, “You need 
everyone marching to product-line 
variants. GM calls it the bill of features.
It’s the common language people
use to talk about the product line.”

Using the notion of a family tree, 
features are grouped to form product 
variations. Using the automotive 
example, the top level would be the 
platform (in this case, the chassis). 
There usually are different chassis for 
sedans, trucks and subcompacts. 
From the platform follows the pro¬ 


gram, which is the sub-family of vehi¬ 
cles that can be built on a particular 
chassis. Regional programs take such 
things as climate and cultural prefer¬ 
ence into account, as features may 
vary slightly due to these factors, 
Krueger said. 

“A car is built out of brakes, info¬ 
tainment systems, steering systems, 
on and on for about 300 different feature
choices,” he said. “It’s the flavor
of braking, steering and infotainment
that the company can use to create 
profiles of these vehicles.” 

Involvement from the top 

Those choices usually start at the next 
level—the trim level—which usually 
comes in base, standard and luxury 
variants. Beyond that comes the vehi¬ 
cle instance, with all the features 
required for one particular vehicle. 
That’s how you end up with an LE 
with a leather interior, sunroof, power 
passenger side seats, navigation sys¬ 
tem, antilock brakes, and whatever 
other features the customer might 
request. 

For software companies, which 
often release their products in free, 
professional and enterprise versions, 
bringing in business decision-makers 


and marketing people can be quite a 
challenge. 

“Commercial organizations pro¬ 
duce millions of product instances per 
year,” Krueger said. “This follows the 
systems-of-systems approach you hear 
about in engineering.” The product 
family tree mirrors this engineering 
approach by taking individual groups 
and pooling them to create more com¬ 
plex groups. 

So as application life-cycle manage¬ 
ment continues to more closely 
resemble product-line engineering, 
the need to manage development by 
feature increases. Gears 7.2 provides 
what Krueger called “down selection,” 
meaning starting from a place that will 
always be true for a particular family, 
then moving down to make choices. 
“In the new release, you can make half 
your choices and have it be a valid 
configuration,” he said. 
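One way to picture a family tree with down selection is as a chain of nodes, each inheriting the feature choices made above it. The sketch below is a hypothetical model written for this article, not the Gears data model or API: the class, function and feature names are all assumptions. It treats a configuration as valid while partially decided, so long as no level contradicts a choice made higher in the tree.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FamilyNode:
    """One level of a product family tree (platform, program, trim, instance)."""
    name: str
    choices: Dict[str, str] = field(default_factory=dict)
    parent: Optional["FamilyNode"] = None

    def resolved(self) -> Dict[str, str]:
        """Choices made at this level plus everything inherited from above."""
        inherited = self.parent.resolved() if self.parent is not None else {}
        return {**inherited, **self.choices}

    def conflicts(self) -> List[str]:
        """Features this level sets differently from an ancestor."""
        inherited = self.parent.resolved() if self.parent is not None else {}
        return [f for f, v in self.choices.items()
                if f in inherited and inherited[f] != v]


def is_valid_partial(node: FamilyNode) -> bool:
    """A partial configuration is valid if no level contradicts its ancestors."""
    current: Optional[FamilyNode] = node
    while current is not None:
        if current.conflicts():
            return False
        current = current.parent
    return True


def undecided(node: FamilyNode, all_features: List[str]) -> List[str]:
    """Features still open for choices further down the tree."""
    decided = node.resolved()
    return [f for f in all_features if f not in decided]


FEATURES = ["chassis", "climate_package", "interior", "sunroof", "navigation"]
platform = FamilyNode("sedan platform", {"chassis": "sedan"})
program = FamilyNode("cold-climate program", {"climate_package": "cold"}, parent=platform)
trim = FamilyNode("luxury trim", {"interior": "leather"}, parent=program)

print(is_valid_partial(trim))     # True
print(undecided(trim, FEATURES))  # ['sunroof', 'navigation']
```

Here the trim level is a valid configuration even though the sunroof and navigation choices are still open; a vehicle instance added beneath it would settle them.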

In the new release, businesspeople 
can work on the feature content, and 
when they’re done, the same product
model can be used by the development
team to create the code. And, before
it’s released into production, it can be 
checked to see if it’s feature-complete.
“It covers the whole DevOps flow,”
Krueger said.

Telerik's HTML5 framework
moves toward 'flat' UI design


BY SUZANNE KATTAU 

Mobile application development
tool provider Telerik
announced in July the availability
of the second of three
scheduled releases of Kendo
UI, its HTML5/JavaScript
framework for building websites
and hybrid mobile
applications.

The company said this
release of Kendo UI focuses
more on user experience,
application design and application performance.
“All three of those things are
not necessarily built into the Web platform,
so we feel like there’s an opportunity
for tooling,” said Brandon Satrom,
program-management lead of cross-platform
tools and services at Telerik.

In Kendo UI Mobile, the company
has added a new “flat UI” universal
theme. “It’s removing a lot of transitions,
gradients, effects and things like that,”
said Satrom.

“We’ve seen a lot of 
momentum around flat, both 
on the Web and in mobile. 
We’re moving away from this 
skeuomorphism of modern 
design that we’ve seen a lot in 
iOS and in other types of 
environments, and trying to 
create something that—to 
borrow Microsoft’s phrase— 
feels ‘authentically digital.’ It’s 
not trying to replicate a real- 
world concept, but rather [it creates] 
something that we know is digital, that 
lives and exists well in a digital world.” 

Satrom said flat is an increasingly pop¬ 
ular design trend that improves mobile 
application performance and saves on 
mobile battery life. “When you create 
something that has fewer gradients and 
fewer shadows and things like that, it’s 
going to be easier for a digital device to 
display it,” he said.


Syncfusion adds OLAP-based grid control 


BY SUZANNE KATTAU 

Windows component provider Syncfu¬ 
sion has released Essential Studio for 
JavaScript, a suite of 30 JavaScript 
client-side controls and components. 

The company said the suite is for Web 
developers who need client-side compo¬ 
nents that satisfy line-of-business needs. 
“Most JavaScript suites today weren’t 
built with enterprise applications in 
mind,” said Daniel Jebaraj, vice presi¬ 
dent of Syncfusion. “With recent 
advancements on the client-side, grids 
render faster, controls are more interac¬ 
tive, and data visualizations such as charts 
can go places never thought possible.” 

Essential Studio for JavaScript gives 
developers an API suitable for enter¬ 
prise applications, and includes gauge 


controls for dashboards, a grid control 
with grouping support, and an interac¬ 
tive chart control. It also contains an 
OLAP-based grid control that connects 
and visualizes business data for analysis. 
“With line-of-business applications, 
there’s a lot going on with OLAP—very 
complex visualizations that dwell in that 
domain,” Jebaraj said. “The OLAP con¬ 
trol shows how this suite is different 
from other solutions.” 

Jebaraj said what Syncfusion has 
done with the OLAP-based grid control 
is just a starting point. “You will see 
many more complex visualizations and 
many more ideas built around that and 
related domains, signaling that we want 
to go further and deliver more of these 
functional sets,” he said.


In other component news... 

Document, content and imaging 
solution provider AccuSoft recently 
released Prizm Content Connect for 
SharePoint v7.2, the company's 
embeddable SharePoint document 
viewer. The service pack update now 
supports enhanced digital rights man¬ 
agement in SharePoint, support for 
SharePoint Document Versioning, and 
saving of multiple named annotations. 

Microsoft component solution 
provider ComponentOne (a division 
of GrapeCity) recently released 
ComponentOne Studio Enterprise 
2013 v2.0, an updated suite of data 
and UI controls for Microsoft Visual 
Studio developers. The suite has a 
new C1RadialMenu control for use in
Windows Store apps, and a new 
RichTextBox control that supports 
rich formatting and CSS. 

Database connectivity solution 
provider Devart recently released
new versions of its dotConnect
ADO.NET providers, which now support
Entity Framework Spatials and
SQL Server 2012 Reporting Services.
Entity Framework Spatials support
has been improved in dotConnect for
Oracle, and has been added to
dotConnect for MySQL and dotConnect
for PostgreSQL.

Imaging developer tool provider 
LEAD Technologies recently updated 
LEADTOOLS Version 18, its multime¬ 
dia imaging SDK. The H.264 Encoder 
shipping with the SDK has been 
updated, including hardware acceler¬ 
ation utilizing Intel's Quick Sync 
Video technology as well as High Pro¬ 
file capabilities. 

Data grid and data-compression 
component provider Xceed recently 
released Xceed DataGrid for WPF 
v5.0. The updated version features a
new table-style view called
TreeGridflowView that displays detail grids in
a tree-like structure under the main 
column at the master level.

WebRTC puts real-time communication into
the browser, enabling audio, video and data to be
exchanged in apps or websites without plug-ins

BY DAVID RUBINSTEIN 


On Sept. 8, 1966, “Enterprise”
communication as we know it
crackled to life when Capt.
James T. Kirk of the Federation
Starship Enterprise flipped open
his communicator and spoke these
words to his crew: “Transporter room:
Three to beam up.”

Yet even earlier that decade, in 1964
to be precise, the Bell Telephone
Pavilion at the World’s Fair in Flushing
Meadow Park showed a family on
Earth having a video chat with the family
patriarch deployed to a space station
orbiting high above the planet.
Imaginations soared.

Today, of course, video chat is real,
and so are audio and video streaming to
remote corners of the world. If there’s
WiFi, there’s communication.

So what’s the buzz about WebRTC? 
Simply, it takes real-time audio/video 
communication to the browser, open¬ 
ing up possibilities for enterprises— 
businesses, healthcare organizations, 
governments and the media, among 
many others—to better serve their 
constituencies. And that, according to 


Dan Burnett (one of the editors work¬ 
ing on the WebRTC specification at 
the World Wide Web Consortium), is 
what will make it truly transformative. 

“We’ve had chat: text, voice and 
video. The ability has existed for a 
while. The difference is that no plug¬ 
ins are required. People hate plug-ins. 
They create security problems,” Bur¬ 
nett said. “To include audio and video 
almost trivially in a Web page is trans¬ 
formative. You’ll see for the first time 
the really ubiquitous use of video.” 

continued on page 34 ►

Data could be biggest piece of all

WebRTC efforts have focused on the exchange of 
audio and video between browsers. But W3C spec 
editor Dan Burnett explained that data exchange 
could be the biggest beneficiary of this work. 

Using an as-yet unimplemented Data Channel API,
he said, "You can send arbitrary and unstructured
data through the channel." He pointed to Cube Slam, a game that uses the data channel
to publish ball and paddle locations, as one example of how this information can
be passed. "There have been peer-to-peer data capabilities, but not browser to browser.
Some people think it'll be bigger than audio and video."
—David Rubinstein
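To make "arbitrary and unstructured data" concrete, here is a minimal sketch of how a game might serialize ball and paddle positions into a small message before handing it to a data channel. The field names and the JSON encoding are assumptions made for illustration; the article does not describe Cube Slam's actual message format, and no browser API is called here.

```python
import json


def encode_state(ball_x: float, ball_y: float, paddle_x: float) -> bytes:
    """Pack one game-state update into a compact UTF-8 JSON message."""
    message = {"ball": [ball_x, ball_y], "paddle": paddle_x}
    return json.dumps(message, separators=(",", ":")).encode("utf-8")


def decode_state(payload: bytes) -> dict:
    """Unpack a message produced by encode_state on the receiving peer."""
    return json.loads(payload.decode("utf-8"))


payload = encode_state(0.25, 0.75, 0.5)
print(payload)                # b'{"ball":[0.25,0.75],"paddle":0.5}'
print(decode_state(payload))  # {'ball': [0.25, 0.75], 'paddle': 0.5}
```

Because the channel carries bytes rather than a fixed media format, each application is free to define whatever message shape suits it.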



◄ continued from page 32 

If you look at building a client that 
does Voice Over IP, you need a founda¬ 
tion of a microphone, camera, proces¬ 
sor and operating system. In a PC, most 
of that is there now. 

Next you need a visual interface. On
a smartphone, that would be the buttons.
On a PC, it’s the screen.

Finally, you need a media engine, 
which takes input, implements code to 
compress audio and video (and does 
echo cancellation), puts it into packets, 
and sends it to the right place. 
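The stages of a media engine described above (take input, compress it, put it into packets, send it) can be sketched as a toy pipeline. This is a simplified stand-in, not how a browser's engine works: zlib replaces a real audio or video codec, echo cancellation and network transport are left out, and the function names and packet size are assumptions for the example.

```python
import zlib
from typing import List, Tuple

Packet = Tuple[int, bytes]  # (sequence number, chunk of compressed data)


def packetize(payload: bytes, mtu: int) -> List[Packet]:
    """Split a payload into numbered chunks no larger than mtu bytes."""
    offsets = range(0, len(payload), mtu)
    return [(seq, payload[start:start + mtu]) for seq, start in enumerate(offsets)]


def send_frame(raw: bytes, mtu: int = 1200) -> List[Packet]:
    """Compress captured input, then cut it into packets ready to send."""
    return packetize(zlib.compress(raw), mtu)


def receive_frame(packets: List[Packet]) -> bytes:
    """Reorder packets by sequence number, rejoin them and decompress."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return zlib.decompress(b"".join(chunk for _, chunk in ordered))


frame = b"captured audio and video samples " * 200
packets = send_frame(frame, mtu=16)
# Packets may arrive out of order; the sequence numbers restore the frame.
assert receive_frame(list(reversed(packets))) == frame
```

With WebRTC, this whole layer already lives inside the browser, so a Web application does not have to supply it.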

In WebRTC, there are actually two 
specifications being worked on side by 
side, Burnett said. One is WebRTC, 
which performs the real-time transmis¬ 
sion. The other is the Media Capture 
and Streams specification, informally 
known as the “getUserMedia” call, 
which is a method for gaining access to 
a local camera and microphone for use 
in an application or to send across a 
peer connection. “You cannot do 


WebRTC demos without Media Cap¬ 
ture and Streams,” Burnett said. “It 
defines media streams and media 
stream tracks of audio and video.” 

From the beginning 

The idea of being able to do real-time 
communication from the browser was 
first acted on at Google. When Google 
acquired Global IP Solutions in 2010, it 
acquired the most adopted media 
engine in VoIP. Google then open- 


sourced the code and put it into 
Chrome, and this became the genesis 
of WebRTC: real-time communication 
in the browser. 

An important milestone in WebRTC
history was reached this year when Firefox
released basic support for WebRTC
in its browser without a user having to
set an option, Burnett said. With
Chrome and Firefox on board, “a signif¬ 
icant fraction of the browser market” 
continued on page 36 ► 


Wait a minute... Don’t we already have Skype? 



BY DAVID RUBINSTEIN 

Isn’t the promise of WebRTC already 
here? Doesn’t Skype represent the kind of 
real-time audio/video communication the 
specification spells out? 

Doug Michaelides, managing director 
of user experience design at software 
development company Macadamian, 
explained that “Skype is an application, 
while WebRTC is a set of technologies 
used to create applications to solve all 
sorts of business problems and customer- 
experience opportunities.” 

Among the ideas put forth at
Macadamian’s WebRTC: Transforming
Communications event in Ottawa in May
was, according to Michaelides, “One 
member of the audience talking about 
how a client of his was using WebRTC on 
their website to basically create a virtual 
buying experience in a catalog for jewelry 
and enabling people to have a conversation with a consultant 
to help them choose the quality of diamond or ruby or what¬ 
ever. You kind of start to get the sense for the ubiquity of 
being able to access real-time communication, and interac¬ 
tion from the website’s going to really enrich Web apps and 
mobile apps going forward.” 


Other scenarios include telemedicine, where potential 
organ donors can speak to a healthcare professional while 
making the decision to become a donor, or questions can be 
asked for a blood services group before giving blood, or where 
people with a certain ailment or using a particular drug can 
speak with others in the same condition.

Peer-to-peer: Feature, or bug?

The kind of communication most associated 
with Skype-type calling is peer-to-peer; that is, 
all connected nodes are both client and server. 

There is no centralized system underlying it. It 
is both the greatest appeal and the biggest 
drawback of the application. 

But for some business scenarios, conferencing
plays a big role, making peer-to-peer a big
limitation. "Large telecoms are trying to solve
that," said Jean-Francois Morin, software developer
at Macadamian. "Conferencing will require
companies to do a lot more."

Morin did say that one way around that is to 
use Flash as a third-party application that can 
unify all browsers despite the WebRTC codec 
they implement, allowing it to host up to 25 
people in a meeting room. 

This is an area being explored right now by a company called
Watchitoo, which is working on a peer-to-server-to-peer architecture
that could, for example, allow a real-time video feed to
stream from a camera directly into a browser window, which can
then scale to open the video and audio stream to an exponentially
larger group.

"It's all APIs," said Nathan King, senior solutions director at
Watchitoo. "That's still at least a year away though, because we're
all still sitting behind firewalls," which block packets from unknown
peers. "But the browser manufacturers are adopting it and running with it."

King added, "The point we are trying to make is that there 
needs to be a medium through which multiple people can commu¬ 
nicate that sits in the middle, allowing the end users to make the 
minimum amount of connections (because of limited bandwidth) 
to expand the maximum amount of users that can participate."

—David Rubinstein 
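King's point about minimizing connections can be illustrated with simple counting. The sketch below compares a full peer-to-peer mesh with an arrangement where every participant connects only to a server in the middle; the function names are made up for this example, and the 25-person room is borrowed from Morin's figure above.

```python
def mesh_links(participants: int) -> int:
    """Total connections when every peer connects to every other peer."""
    return participants * (participants - 1) // 2


def mesh_links_per_user(participants: int) -> int:
    """Connections each peer must maintain in a full mesh."""
    return max(participants - 1, 0)


def relay_links(participants: int) -> int:
    """Total connections when each peer connects only to a central server."""
    return participants


room = 25
print(mesh_links(room), mesh_links_per_user(room))  # 300 24
print(relay_links(room))                            # 25
```

In a 25-person mesh every user carries 24 connections, while a server in the middle cuts that to one per user, which is the bandwidth argument for a peer-to-server-to-peer design.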


◄ continued from page 34 

supports WebRTC today, he 
added. 

“Now, all the application
has to do is manage the interface.
The media engine’s
already in the browser,”
explained PKE Consulting’s
Phil Edholm, producer of the
WebRTC Conference. “All
the app is doing is making
API calls.”

So the communication
interface becomes the browser
itself, no download required.

“With HTML5, the concept of
apps morphs,” Edholm said.
“Browsers look like apps.
Download a pointer and have
an app experience. Now, that
can include real-time voice,
video and data. Point your browser
at Skype and have the full experience
without downloading the client.”

To take it to the next step, Edholm 
said, “If we’re connected to a website, 
the site can initiate communication 
between our browsers.” This, he 
emphasized, represents a profound 



change in how communication 
occurs. 

Prior to this, services had to 
talk to each other to negotiate 
communication on behalf of peo¬ 
ple. With Skype, Edholm said, if 
you’re not a member, you’re back 
to server-to-server communica¬ 
tion, not peer-to-peer. 


It was at CERN in the late 
1980s that the notion of creating 
a browser to talk to a server farm 
led to the concept of the World 
Wide Web. And, in almost all infor¬ 
mation systems since 1993, you 
point a browser at a server and cre¬ 
ate an event with that server. 
WebRTC enables communication to 
follow the same paradigm. In the 
next three to five years, Edholm said 
that as many as 4 billion devices will 
be WebRTC-enabled. 


Communication as a secondary event 

Edholm estimated there are 20 million 
to 40 million active websites, and that 
half of those could be capable of host¬ 
ing communications. “Eighty percent 


to 90% of transactions are preceded by 
a visit to a website,” he said. “There’s 
something the user can’t resolve, so he 
calls an 800 number listed on the site. 
That site could easily move to contextu¬ 
al communication.” 

Aside from the kind of communica¬ 
tion for when a website senses a user’s 
input error, where a box pops up and 
asks if he or she would like to start a 
chat with a customer service rep, 
Edholm said he sees a time when “if 
two of us are looking at the same prod¬ 
uct on Amazon, we can have a real-time 
discussion with other potential pur¬ 
chasers” about price, reliability and oth¬ 
er factors that go into a purchasing 
decision. And the users did not come to 
the Amazon site for that communica¬ 
tion; they came to buy a book. The 
communication is secondary, but offers 
tremendous value to the user. 

“The biggest impact of the Web was
not created by tech people,” Edholm said.
“At eBay, some guys wanted to sell antique
Pez dispensers.”

Open source and commercial solutions roll out
to help organizations do the heavy data lifting

BY ALEX HANDY

Big Data may be the new big
buzzword, but it’s not an entirely
original concept. With some
careful digging, you can find organizations
that have already been building
with and for Big Data for years. What’s
really changed in the Big Data revolution
is that there are now numerous
tools, both open-source and commercial,
for handling all of that data.

Bill Yetman, senior director of engineering
at Ancestry.com, has a data
problem as big as history. (Your history
and my history, to be precise.) Ancestry.com
doesn’t just track family trees; it
houses historic documents and infor¬ 
mation that is used to verify family his¬ 
tories. And it’s been doing it for more 
than 10 years. 

In practice, that makes the business
more like a data-analysis firm with a
public-facing interface than a traditional
genealogy company. And it also
means that its workflows on data have
evolved into a replicable pattern: an
example of how Big Data should flow
through an enterprise.

“We’re a pretty classic data ware¬ 
house. We’ve been an enterprise data 
warehouse for 10 years,” said Yetman. 
“With that, as we’ve gone over time, 
we’ve turned around and put a lot of our 
behavioral data, engagement data, and 
other info—like how users are building 
their family trees—out into the data 
warehouse, which is a little bit backwards.
A lot of that data is very unstructured,
so we’re taking it and putting it
into a structured data warehouse.”

And this, despite its prescience, is 
the future planned for Big Data: stor¬ 
ing everything, even user interaction 
data, back in the central repository. It 
is the active Big Data plan that can 
then take that information and turn it 
into actionable intelligence for the 
business people. 

Merv Adrian, research vice presi¬ 
dent at research firm Gartner, said that 
data has, up until now, been captured in 
a lossy fashion. It wasn’t possible for 
businesses to tabulate every possible 
variable and condition in every single 
transaction. The store temperature, 
sales status, and prior customer pur¬ 
chase information were unknowable 
elements when the focus was on track¬ 
ing the money and the inventory in a 
reliable fashion. 

A place to put it 

One of the biggest reasons for the new 
desire within enterprises to capture 
and analyze any and all data, struc¬ 
tured or unstructured, is that there is 
now a place to put all that information. 
The Apache Hadoop project has revo¬ 
lutionized how companies handle and 
utilize their longer-term business 
information. 

“One of the first things we did with
Hadoop was use it to capture all our
logs,” said Yetman. “The way we did
this was to turn around and pull that
behavioral data out of the data warehouse
and into Hadoop, where we’d do
the analytics and the processing and
evaluation. It’s almost ETL happening
in Hadoop.”

Other Big Data analysis platforms
have also sprung up to accommodate
the market’s newfound need for Big
Data tools and platforms:

Sqrrl’s Accumulo platform is built
from the U.S. National Security
Agency’s now-infamous data-gathering
platform. LexisNexis, the legendary
subscription-based data-mining
service, open-sourced its HPCC
Big Data platform in 2011. Twitter’s
Storm project offers a different
approach to processing Big Data, and
the UC Berkeley Spark project pushes
the ideals of Hadoop even further
while building on top of the HDFS
file system. All the while, Hadoop 2.0
is being brewed by the elephants of
Hortonworks.

With so many data platforms and 
processing environments to choose 
from, the prospect of devising a Big 
Data strategy for the developers at your 
company can be daunting. And don’t 
worry, you’re not overreacting: It’s 
extremely daunting. 

It’s made even more daunting by the 
prospect of your higher-ups having 
caught the buzz of Big Data. Gartner’s 
Adrian said that he’s even had calls 
where the client has said, “The CEO 
read about Big Data in a magazine on 
an airplane, and he got off and said, 'We 
need Big Data.’ What does that even 
mean?” 

Even worse, you can’t expect to hire
your way into a Big Data solution. Adrian
said that talent is at an extreme premium
at the moment. He cited Gartner
estimates that there will be a need for
4.4 million new data-knowledge workers
to handle the demands businesses
will be creating in the next decade.

The big guns of Big Data

IBM, Microsoft and Oracle are the world's largest database companies. And while they
all offer plenty of data-analysis tools and layers for IBM's DB2, Microsoft's SQL Server
and Oracle's 11g, all three companies have also embraced Hadoop as a new solution.

Yet, despite having adopted Hadoop into their product lines, all three companies
have completely different takes on how software should be consumed by enterprises.

Microsoft, for example, pushes its HDInsight service from Windows Azure. Within
Azure, developers can spin up and manage a Hadoop cluster, and run batch jobs
across data stored there. HDInsight is also available for Windows Server for organizations
looking to work with an on-premise solution.

Oracle, on the other hand, sells Hadoop as an integrator adapter for its existing
hardware and software solutions. Oracle also entered an agreement with
Cloudera in 2012 to provide the Cloudera Enterprise Hadoop distribution
to its customers.

Finally, IBM has had, likely, the oddest engagement with
Hadoop. Rather than simply selling IBM PureData System for
Hadoop and other connectors for its database, the company actually
took its Hadoop knowledge to Jeopardy. IBM's Watson machine used Hadoop to
help it understand questions, and answer them in the form of a question.

—Alex Handy

What’s a development manager to
do? Yetman has some advice. “How do 
you identify and capture whatever is 
all of your data? Because ‘all’ is a lot. 
What we’ve had to do is approach 
things in an iterative manner. We iden¬ 
tify a key set of data and go after it. We 
figure out the right way to ingest it and 
provide high partitioning at a simple 
level. How do we get it in the hands of 
someone who can take a look at it and 
see if it’s providing the value you want? 
You can’t just boil the ocean. You have 
to go after things one piece at a time. 
Find the data that’s going to be the 
most valuable for you as a business and 
attack those first.” 

Moving the data 

At the beginning of any Big Data strat¬ 
egy design endeavor, the first choice is 
fairly similar to the first choice in busi¬ 
ness: Where do you set up shop? For 
Big Data, this typically leads to a few 
early decisions on platform. For 
Hadoop users, HDFS or Cassandra are 
the first choices to consider. Other 
options, such as running Hadoop on top 
of ZFS through Lustre and other file 
systems, are becoming more viable over 
time as solutions there mature. 

But even after choosing your platform’s
file system, there are a dozen
other data-focused decisions to be
made. How will you store your relational
data within this platform? How about
your unstructured data? How will you
manage flat files and versioning? And
what methods will you be using to
access all of this information?

For relational data, HBase has 
matured into an increasingly capable 
data-management platform for tradi¬ 
tional enterprise datasets that require 
ingestion and availability within a 
Hadoop cluster. 

According to Justin Erickson, direc¬ 
tor of product management at Cloud¬ 
era, HBase’s community has been 
focusing on “two big buckets of things. 
First is stability and durability, and the 
second is the ease of use of the whole 
platform. There’s been some general 
work to harden HBase, so there are 
less bugs. Some of what we’ve been 
doing around replication—and the 
recent work around snapshots—are 
examples of things we can do to make 
it easier for new developers to go to 
the system.” 

Those replication and snapshot
changes help with the general use of
HBase by developers and administrators,
said Erickson. One of the primary needs
of many developers is simply the ability
to quickly test things locally, on their
desktop or laptop. He said that Cloudera
and the HBase community have been
working to make it clear how to prototype
in this fashion using HBase, a feat
that requires a number of additional
moving parts beyond just HBase itself.

While replication and snapshot 
capabilities are evolving in HBase, 
MapR has long made its name from 
offering these capabilities in its 
Hadoop platform. Tomer Shiran, 
director of product management at 
MapR, said that his company’s 
Hadoop distribution “provides 
point-in-time recovery things like 
snapshots and disaster recovery.” Con¬ 
sidering that Hadoop clusters tend to 
range into the petabyte region, disas¬ 
ter recovery can save an organization 
considerable time and money. It’s an 
important consideration for any Big 
Data cluster. 

“Having that full high availability 
across every layer of the stack is unique 
to MapR as well,” said Shiran. “You 
look at any other system in the enter¬ 
prise, whether that’s Teradata, NetApp 
or Oracle, they all have these capabili¬ 
ties. They have to have these.” 

So while open-source Big Data plat¬ 
forms like Hadoop are enticing, they 
are often devoid of the enterprise func¬ 
tions required by most organizations. It 
is for this reason that so many different 
Hadoop distributions are available for 
enterprises: Cloudera and MapR have 
their own distributions, Hortonworks 
favors vanilla Apache Hadoop, and oth¬ 
er companies like WANdisco, Microsoft 
and even Netflix have their own distri¬ 
butions. 

But data processing doesn’t begin 
and end only with Hadoop as the stor¬ 
age framework. HDFS is not the only 
way to go. The Apache Cassandra proj¬ 
ect has proven quite popular as a 
NoSQL storage system for working 
with Hadoop, said Gartner’s Adrian. 

Elsewhere, Leon Guzenda, co¬ 
founder and CTO of Objectivity, said 
that graph databases are a useful alter¬ 
native tool for analyzing the relation¬ 
ships between data, something that 
Hadoop is aiming to support through 
the Giraph Project, which is not yet 
finished. 

Objectivity, on the other hand, has
been offering its InfiniteGraph data¬ 
base as a solution for Big Data needs. 
Guzenda said that InfiniteGraph is a 
good alternative to NoSQLs and 
Hadoop as it allows developers “to find 
relationships not based on statistical 
correlations. There’s other data in there 
none of these things touch: It’s in the 
relationships between the data. It 
might be just straight visualization of 
the graph of the network.” 
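To illustrate the kind of question Guzenda is describing, here is a minimal sketch in Python. It is not InfiniteGraph's API; the entities, the edge list and the `shortest_path` helper are all hypothetical. It shows how a relationship between two records can be found by walking links rather than by computing a statistical correlation.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search over an adjacency dict.

    Returns the shortest chain of nodes linking start to goal,
    or None when the two are not connected.
    """
    if start == goal:
        return [start]
    visited = {start}
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        for neighbor in graph.get(path[-1], ()):
            if neighbor in visited:
                continue
            if neighbor == goal:
                return path + [neighbor]
            visited.add(neighbor)
            queue.append(path + [neighbor])
    return None


# Hypothetical "who is linked to what" data; every name here is made up.
edges = [
    ("alice", "acct-17"),
    ("acct-17", "bob"),
    ("bob", "phone-555"),
    ("carol", "phone-555"),
]

# Build an undirected adjacency dict from the edge list.
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

print(shortest_path(graph, "alice", "carol"))
```

Nothing in the rows for "alice" and "carol" correlates the two; the connection only appears when the links between records are followed.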


Processing the data 

Once the Big Data is in place, it’s time 
to write the actual code that will 
process it. Be it in Accumulo, Hadoop, 
SAP, SPS or any other system, writing 
software to read and process all of that 
information is no small task. 

For users of today’s Hadoop, this 
means writing Map/Reduce jobs in 
Java. For users of the forthcoming 
Hadoop 2.0, that means writing just 
about any batch-processing job you can 
think of, possibly in numerous lan¬ 
guages. 
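For readers who have not written one, the shape of a Map/Reduce job can be sketched in a few lines. The example below is plain, single-process Python, not Hadoop's Java API, and the function names and sample records are assumptions made for illustration. It only shows the three steps the model relies on: map each record to key/value pairs, group the pairs by key, then reduce each group.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence over an iterable of records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical input: three request lines from a web log.
logs = ["GET /tree", "GET /search", "POST /tree"]
counts = run_job(logs)
print(counts)
```

In a real cluster the map and reduce calls run on different machines and the shuffle moves data across the network; the division of labor is the same.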

Arun Murthy, founder of and architect
at Hortonworks, said that Hadoop
2.0 will enable significantly more use
cases for all that data stored in HDFS.
“As we looked at Hadoop three or four
years ago, we saw that people were putting
all the data into HDFS,” he said.
“It doesn’t discriminate against data
types, but Map/Reduce was the only
way to get at that data.”

A new Spark for Big Data

While Hadoop has undergone more than three years of rewriting in order to reach
version 2.0, many of the features being brought into the project with that release are
already available in Spark, an open-source project out of UC Berkeley.

Spark, at its core, is a near-real-time take on Hadoop. It's a platform for processing
large amounts of data, but there are some fundamental differences with Hadoop. For
starters, Spark is based in memory, and the data that it spreads across its cluster
resides in RAM. While HDFS is still the default file system, the medium of the data storage
is inherently faster than the disk-based storage used in Hadoop.

That means batch jobs can be run at a significantly faster rate than
those that run on Hadoop, simply by virtue of the data being more
readily accessible through a faster medium.

As a result, Spark can also do stream processing, which is
Storm's forte.

Finally, Spark is designed to save developers time by using a concise API and focusing
on using Scala as a language for writing jobs. Spark is still in its early stages, but
it's already offering many features that are only available elsewhere in Hadoop 2.0.

—Alex Handy

The natural next step, he said, was to 
change Hadoop to support other types 
of workloads on all of that data. Now 
that there was a place to put all that 
data, Map/Reduce wasn’t enough to 
accomplish the myriad tasks corpora¬ 
tions wanted to perform on top of it. 

This desire leads to the use cases 
currently filled by Twitter’s Storm proj¬ 
ect: stream processing and manipula¬ 
tion of information in real time. Murthy 
said that the desire for this capability is 
already rearing its head in the form of 
companies pushing to enable real-time 
SQL queries on Hadoop. 
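The difference between this and a batch job is easiest to see in code. What follows is a single-process Python sketch of stream processing, not Storm's API: each record flows through a chain of small stages as it arrives, instead of waiting for a whole dataset to be collected. The stage names and the sample addresses are assumptions, and a real system would add the distribution and fault tolerance that this sketch leaves out.

```python
def strip_whitespace(records):
    """Collapse runs of whitespace and trim each record."""
    for record in records:
        yield " ".join(record.split())


def title_case(records):
    """Normalize capitalization so equivalent records look alike."""
    for record in records:
        yield record.title()


def drop_duplicates(records):
    """Pass each distinct record through once, ignoring case."""
    seen = set()
    for record in records:
        key = record.lower()
        if key not in seen:
            seen.add(key)
            yield record


def pipeline(source, *stages):
    """Chain generator stages so records are processed one at a time."""
    stream = source
    for stage in stages:
        stream = stage(stream)
    return stream


# Hypothetical incoming address records.
incoming = iter(["  12 main   st ", "12 MAIN ST", "9 elm ave"])
cleaned = list(pipeline(incoming, strip_whitespace, title_case, drop_duplicates))
print(cleaned)
```

Because every stage is a generator, no stage holds the full stream in memory; the only state kept is the set of keys already seen.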

Storm on the horizon

Twitter's Storm Project is an example of what should be expected from the open-source
community in a post-Hadoop world. While Hadoop is bound to disk and
Map/Reduce jobs only, Storm is a massive harness and queue for processing streams
of information.

Storm is still in its early stages, but many startups have already rolled it out into
production. At its heart, Storm is about taking streams of data and performing computations
on them, in a fault-tolerant, highly scalable manner.

One early user of Storm is Groupon, which uses the platform to normalize
address information for its customers. As properly formatted, interpretable
business location information can be tricky to collect and keep
up to date, Groupon passes all incoming information through a Storm
system that performs more than 40 computations on the information.

At the end of the queue, an address has been checked for spelling,
age, veracity, proper punctuation, duplication, and dozens of other
possible errors that could be found. Storm, while more focused on real-time
processing of streams, is still a Big Data platform to keep an eye on.

—Alex Handy

YARN is the next generation of
Map/Reduce for Hadoop. This project
seeks to split up the two major functions
of the Job Tracker in Hadoop:
resource management, and job scheduling
and monitoring. YARN allows
each individual application to have its
own manager, which in turn allows
more jobs to be run on a Hadoop cluster
concurrently.

“YARN becomes this generic 
resource,” said Murthy. “We wanted to 
do streaming event processing; there is 
a need for interactive SQL, and we 
have the Hive and the Stinger proj¬ 
ects... YARN seemed like the right solu¬ 
tion. We’ve been working on this for 
three years,” and he felt as though the 
work is almost done. 

Once all those jobs are finally 
brought over to Hadoop, however, there 
is one sticking point every developer can 
relate to: optimization. For standard
applications, attaching a debugger and
some performance-monitoring tools is
as easy as a few clicks in the IDE. But
when your application is spread across 
100 servers, each with its own data store 
and hardware quirks, performance 
management can be a nightmare. 
That’s why Compuware is
offering its application performance-management
tools in the
Hadoop marketplace. Michael
Kopp, technology strategist for Compuware
APM Center of Excellence,
said, “Our business unit inside Compuware
focuses on application performance
management. We help our
customers around the world trou¬ 
bleshoot and manage the performance 
of their mission-critical applications in 
production settings and pre-produc¬ 
tion settings. In the last year and a 
half, customers have come to ask us to 
apply application performance moni¬ 
toring for their Big Data applications. 
We have applied APM to Hadoop so 
we help our customers cut down on 
effort and time for problem-solving 
and finding performance issues 
inside Hadoop.” 

Analyzing the data 

Once the data is in place, perhaps the 
most difficult task in this new Big 
Data world is writing the batch-pro¬ 
cessing Map/Reduce jobs themselves. 
It is in this skill space that Gartner’s
Adrian described a lack of employable 
candidates. And while numerous 
tools, such as Cascading, Hive, Pig 
and the new Stinger Initiative all 
attempt to give developers an easier 
way to access and process data inside 
of Hadoop, there’s still no easy way to 
bring an entire team up to speed on 
writing Map/Reduce, short of paying 
for some training. 

Objectivity’s Guzenda said, “I think
we need predictive analytics; it has got
to be a lot easier to use. It’s really confusing
for people who don’t have a
grounding in statistical methods. It’s
very easy to come to incorrect conclusions
if you don’t understand what the
statistics are doing. I’ve seen some
interesting examples of that: The fact
that the numbers 8 and 5 appear in my
phone number, and my height have
nothing to do with each other.”

Is Big Data just a Big Bubble?

Looking around at the conferences, webinars, ads and venture capital investments, it
certainly sounds and looks like a bubble. But to be a bubble, the fundamental change
at hand must be an ephemeral and not-quite-ready-for-prime-time sort of change.
When speaking to Big Data experts on the potential of Hadoop and other analytics
platforms, it's clear that this isn't some shallow coat of paint on existing tools. The
business value of Big Data is very real, and wide-reaching.

Ryan Betts, field CTO at VoltDB, said, "The nature of a bubble is you never really
know when you're in it. That being said, I think Big Data is very real. I think it's more
than a buzzword. We already see the impact in our day-to-day lives.
It impacts us every time we use the Web in the way ads are targeted
at us. I truly believe this is going to be transformative to retail.
Everywhere you carry a phone, there is an impact in Big Data.
You're taking with you a sensor that identifies you. I think it's really
real in the same way the Internet seemed like a buzzword, and
turned out not to be. I think Big Data is really going to be about data and
mobile, and tools like HDFS to extract data from those feeds."

—Alex Handy

One of the fundamental requirements
of any data-analysis project is the
ability to go traipsing through the data
to figure out just what you’re working
with. One of the easiest ways to do this
is with a search tool.

In YARN, the Resource Manager and the Node Manager form the data computation framework.
Source: The Apache Software Foundation

LucidWorks makes just such a tool. 
Originally formed to support the 
Apache Lucene, Solr and Nutch proj¬ 
ects, LucidWorks now offers open- 
source enterprise search tools that can 
be embedded in existing applications, 
or on top of Hadoop. 

Grant Ingersoll, CTO of Lucid¬ 
Works, said that his company comes 
directly from the same place Hadoop 
comes from. “A lot of people don’t real¬ 
ize Hadoop started as part of the 
Lucene project to assist with building 
large-scale distributed indexes, and 
along came Yahoo and said, ‘We can
use this for other use cases.’ We take a 
connector-based approach. If you’ve 
got Hadoop, we’ll treat that like data,” 
he said. 

LucidWorks is also partnering with 
MapR to spread its search solutions to 
those customers. 

Another company that has long 
made a business of traipsing through 
unstructured data is Splunk. Coming 
from the IT administration side, where 
analyzing logs can require searching 
for a single line among millions, Splunk 
has grown into an enterprise data-discovery
tool with a powerful interface
for developers. 

Clint Sharp, senior product manager
for Big Data integrations at Splunk,
said, “What’s different about Splunk is
that we don’t require you to do structuring
and analysis of the data in
advance. There are no ETL requirements.
I don’t have to give you the
data in a tabular form. Give us the data
however it sits, and Splunk will be able
to read that data and give you the ability
to do charting and analytics on it.
We’re allowing them to do the analytics
on the data without having to do a
whole lot of upfront investment in
order to do analytics on top of that.”

LucidWorks' search tool can be embedded in applications or on top of Hadoop.

This is a different mindset from tradi¬ 
tional business analytics, where the ques¬ 
tions must be prepared before the data is 
massaged into a form where the answers 
can be gleaned. Hadoop and Splunk 
require no prior data normalization, 
which means the data flow pipeline does¬ 
n’t need to have dozens of embedded 
transformations. Indeed, this can cause 
some confusion for traditional enterprise 
users. 
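A small sketch may help traditional users see what "no prior normalization" means in practice. The Python below is not Splunk's or Hadoop's interface; the log format, the `extract_fields` helper and the `count_by` helper are assumptions for illustration. The raw lines are stored untouched, and structure is imposed only at the moment a question is asked.

```python
import re
from collections import Counter

# Hypothetical raw log lines, kept exactly as they were written.
raw_lines = [
    "2013-08-01 12:00:01 status=200 path=/tree user=ann",
    "2013-08-01 12:00:02 status=500 path=/search",
    "malformed line with no fields",
    "2013-08-01 12:00:03 status=200 path=/search user=bob",
]

# A key=value pair: a word, an equals sign, then any run of non-space characters.
FIELD = re.compile(r"(\w+)=(\S+)")


def extract_fields(line):
    """Pull key=value pairs out of one line at query time; no pairs gives {}."""
    return dict(FIELD.findall(line))


def count_by(lines, field):
    """Count the values of one field, skipping lines that do not carry it."""
    parsed = map(extract_fields, lines)
    return Counter(fields[field] for fields in parsed if field in fields)


status_counts = count_by(raw_lines, "status")
path_counts = count_by(raw_lines, "path")
print(status_counts)
print(path_counts)
```

A new question, such as counting by `user`, needs no reload and no schema change, which is the trade the article describes: less upfront transformation in exchange for parsing work at query time.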

“The first piece of advice I would give 
is ‘Don’t throw it away,’” said Sharp. 
“The key to analyzing the data is having 
it. Find a place to store it. We don’t care 
if that’s in Hadoop or not. Keep the data. 

“Second, my advice would be [that]
the world of ETL is, from my perspective,
a wasted investment. Rather than
doing a bunch of reformatting and
structuring, you’re going to chew
through a lot of labor doing that. When
we worked on a business-intelligence
project, well more than 50% to 70% of
the investment in an analytics or [business
intelligence] project is just collecting
the data and getting it into the right
rows and columns: figuring out how to
structure the data so I can ask it the
right questions.”

Those days are over, he added, 
thanks to unstructured data analysis 
platforms like Hadoop and Splunk. 

That’s good news for Ancestry.com’s
Yetman, who’s been putting all of this
Big Data technology to use at his company.
The goal of all these technologies
is to eliminate humans from the decision
process, and to use machine learning
to figure out when business
events (like fraud, market opportunity,
or individual customer sales incentives)
are occurring, and to react to
them instantly. Despite the slower
nature of Hadoop processing at present,
the future is in real-time, computer-assisted
decision-making.

Yetman’s team is already in the
future, thanks to its Big Data experience.
“We’ve used machine-learning
algorithms. Some of our records are
hand-written, but others are typeset.
City directories, which are older, will
show the name of the person and the
occupation, before there were phone
numbers,” he said.

“We’re using natural-language processing
to turn around and pull out
names, occupations and addresses.
Obituaries are even harder. Can you
identify the person who died, but also
the spouse, the children, the surviving
relatives? It’s where they were born,
where they died, where they lived. A
lot of machine learning is done to evaluate
that. Then we host the algorithms
as a service and call the service with
the new content to actually do the
work. The whole idea is how can we
take this content all the way to the site
without a human being involved
so we can get it indexed on the site in
a totally automated way.”

Windows Azure is Microsoft’s
cloud-computing platform, as 
well as the name of the 
assemblage of services that make it up. 
Initially designed to be a Platform-as-a- 
Service (PaaS), Windows Azure has 
been updated over the last couple of 
years (based on feedback as well as 
competition from Amazon and others) 
by expanding its options and offerings, 
and by adding more traditional Web- 
and server-hosting capabilities. 


BY PATRICK HYNDS 

Basically, if you want to do some¬ 
thing that requires systems attached to 
the Internet, you can get Microsoft to 
host it for you at a fraction of the cost of 
putting servers in your own data center. 
While that is the main promise of the 
cloud, Microsoft has taken it a great 
deal further. 

Microsoft’s Windows Azure business
reportedly passed the billion-dollar
mark in annual revenue recently, and as
a result, it is hard to argue that it is not
onto something with this cloud stuff.
Aamir Shah, Microsoft senior cloud 
manager at En Pointe Technologies, 
said that “the tide has turned with the 
concept of cloud. Customers are ready, 
and they’re eager to find out what the 
starting point is. While the average per¬ 
son thinks that cost is the paramount 
factor for seeking a cloud solution, 

Will It Be Secure?


Security is important to everyone, and questions about the secu¬ 
rity provided by Azure were the topic of the very first conversa¬ 
tions I had with Microsoft staff when Azure was first announced. 
There are certifications for various levels of government data 
and many other factors to consider when thinking about the 
security of a cloud platform or any -as-a-service provider. 

Microsoft's online properties see an enormous level 
of assault. Maybe this continuous barrage of 
attacks causes you to think your data is bet¬ 
ter off in your own data centers, but the 
safest places on the Internet tend to 
be those that get attacked the most, 
because they have withstood those 
attacks for so long that they have 
learned how to stay protected. 

Windows Server 2012 is a great exam¬ 
ple of the arms race that is online security 
these days. Windows Server 2012 bakes into the OS 
protections from cyberthreats on the level of the Stuxnet virus, 
and Windows Server 2012 and its hypervisor capabilities under¬ 
lie Azure. Microsoft's Ken Johnson and Matt Miller presented at 
Black Hat last year on exploit mitigation for Windows 8 (and 
Windows Server 2012). 

Stuxnet represents a serious threat to systems, and
Microsoft looked at how it was approaching some of its attacks.
Some of the biggest insights had to do with predictability in OS
structures such as the heap. Windows Server 2012 makes it so that
things are much less likely to be at a memory address that can
be guessed by an attacker.

Duane Laflotte, CTO of CriticalSites, agreed that there is no 
real safe harbor on the Internet (except for not being on the 
Internet). He cautioned organizations looking to use the 
cloud to “refrain from just throwing solutions up and 
hoping the walls hold the bad guys out. The truth is
that encrypting data at rest, and in motion
where practical, will go a long way to
upping the security of your most important
assets, whether they sit in the cloud
or on-premise.”

When asked if he had any data on the 
cloud, and specifically on Azure, he said 
yes, adding, “I know that the security people 
at Microsoft are top-notch and are making the secu¬ 
rity of the Azure platform a top priority. To be honest, they 
are likely the most secure data centers you can find anywhere.” 

Security can be hard to prove. Thus far there are no major 
security breaches of Azure of which I am aware, and I am certain 
they would be widely and loudly publicized if and when they 
were found. The same is true of Amazon, of course, with regard 
to security. The winner in the security space will likely be the 
first one to retain a clean security record after the other has 
been breached. —Patrick Hynds

we’ve seen scalability being the most
compelling reason.” 

This rings true from my own experi¬ 
ences, where projects are often delayed 
by weeks or months as new servers on 
physical hardware (and even occasion¬ 
ally on virtual machines too) cannot be 
acquired and set up quickly. It frus¬ 
trates everyone in the process, and 
costs money in wasted time and missed 
opportunities. 

In light of all this, it is clear that now 
would be a good time to come to grips 
with some core questions concerning 
Windows Azure if, like many, you find 
you have let your grasp of the current 
state of Microsoft’s PaaS and Infra- 
structure-as-a-Service (IaaS) offering 
fall behind. In this article, we will cover 
the whys and hows that will help you 
navigate what has become a very com¬ 
prehensive set of offerings that can be 
bewildering at first to anyone who did 
not watch it evolve from the beginning. 

Microsoft has had to play catch-up
(mostly with Amazon) since it first
announced Azure, but now it competes
with a wide array of other targeted
cloud platforms as well. The initial
problem with the offering was that it
was an all-or-nothing undertaking.
There was no integration story with on¬ 
premise systems, and there were no 
IaaS aspects to the initial offerings at 
all. If these are your current perspec¬ 
tives, then you should suspend judg¬ 
ment based on all you have heard up 
until now, because if you did not hear it 
directly from Microsoft in the last six 
months, then it is time to revisit what 
Azure can do for you. There will still be 
situations where other cloud platforms 
better fit your needs, but at least now 
Windows Azure is a contender. 

Why the cloud? 

The promises of the cloud are mostly
found in the economies of scale that
can be had when many organizations
share the core costs of data centers
along with the efficiencies of virtualizing
system loads. Microsoft has pushed
aggressively to make its Hyper-V virtualization
technology able to drive these
cost savings in both the Azure cloud
and in on-premise virtualized systems.

En Pointe’s Shah pointed to several 
major themes that are driving his com¬ 
pany’s customers toward using the 
cloud—specifically Windows Azure—in 
their solutions. He felt that flexibility, 
scalability and efficiency were advan¬ 
tages everyone could understand with 
little trouble. When asked for an exam¬ 
ple of how flexibility and scalability are 
attained from using Azure, he said, 
“Being able to spin up servers at the 
drop of a dime, rather than spending 
time architecting an on-premise build.” 

His point highlights how easy it has 
become to generate a server on the 
IaaS cloud platforms. This is not just 
true for Azure, but also with the cloud 
offerings from Amazon, Rackspace, and 
(later this year) VMware. The advan¬ 
tage Microsoft currently has is the abil¬ 
ity to choose from template VMs that 
do not need to be installed or uploaded 
for faster deployment. 

Flexibility like this is what drove the
adoption of the first PC LANs as 
departments searched for ways to get 
out from under the thumb of IT. That 
new trend goes to show that everything 
is faster in the technology world, includ¬ 
ing how fast history repeats itself. Asked 
how he sees efficiency manifested in 
leveraging Azure, Shah said, “Always 
up, accessible anywhere, keeping your 
employees connected whether they are 
remote or in the office.” The cloud is 
just the next logical step for those that 
have already embraced virtualization. 
The advantages of virtualizing servers 
are very much the same as the advantages
of leveraging cloud infrastructure.

Which cloud? 

There are choices when it comes to 
cloud providers, with Amazon and Windows
Azure being the most recognized
names competing in the space today.
Many of the features of Azure are
strongly developer-centric and allow
complex systems to be deployed without
the need for strong IT support.

To attract developers outside of 
Microsoft shops, the Azure team has 
enabled non-Microsoft technologies 
such as PHP and a whole raft of others 
on the platform. I talked to Bruce Backa,
CEO of NTP Software, about why
he prefers Azure to Amazon, and he
said, “While Amazon’s EC2 compute
model has changed the world, it has one 
really harsh embedded assumption: that 
your business started yesterday.” He 
went on to explain that, “for most of the 
27 million businesses in America, that’s 
not true. Azure provides a platform for 
businesses with pre-existing applications
and data to smoothly integrate the
cloud and scale out globally.” 

This does not apply to the many startups
that have begun their systems on Amazon
and are very happy with it. Organizations
that have a large database or on-premise
system that has to be part of the solution
can integrate those systems with Azure
now. One place where Microsoft has an
advantage in on-premise integration is in
allowing the organizational Active Directory
to federate with Azure, allowing
on-premise provisioned accounts to be used
seamlessly to authenticate cloud-based
systems.

Azure also has the benefit of providing
options for migrating systems from
the cloud to on-premise and back again, 
thanks to Hyper-V being the basis for 
both the Azure platform itself and 
being available in Windows Server 2012 
in your own data center. Microsoft also 
takes full advantage of including the 
operating system license costs as part of 
the offering, so if you need to spin up 
12 Windows Server instances for a load 
test on any other cloud, you need to 
have the licenses. But on Azure, they 
are included. If the systems in question 
are Linux, then this is a non-issue, but a
lot of the world does run on Windows
and needs to be tested on Windows.

PaaS vs. IaaS

IaaS has been available for a long time 
in the form of Web hosting and server 
hosting. Rackspace and Amazon have
been leading the charge in innovating 
this form of cloud platform in ways that 
make it way easier to use than previous 
hosting providers. 

As mentioned earlier, Microsoft has 
been adding IaaS options to its initial 
offerings in realization that neither IaaS 
nor PaaS fit all needs. PaaS is about 
removing the IT overhead and chores 
from the equation entirely, as there is 
not even a VM to upload, configure or 
patch. With PaaS, the developer clicks 
to the environment and deploys the 
solution in a way more like a Lego system
than enterprise architecture. This is
not to trivialize the result or skill 
required to envision and implement, 
but it does accelerate and streamline 
maintenance in ways that IaaS does not. 

Benjamin Day, owner of Benjamin 
Day Consulting, has been a technical 
authority on Azure since it was first 
announced. He said, “Azure’s PaaS offering
has always seemed like a big win for
teams because you just worry about your
application code, and then the rest of the
details are taken care of by Azure. Basically,
it’s Web hosting on steroids.”

If Windows Azure is ultimately the 
winner of the cloud platform wars, it will 
likely be due to the superiority of PaaS 
as a model for cloud adoption rather 
than any astute maneuvering done by 
Microsoft over Amazon. But it doesn’t 
hurt that Microsoft is playing in the IaaS 
space as well. 

Making sense of dollars 

As mentioned earlier, one of the 
biggest advantages of taking 
operations to the cloud is cost 
efficiency. In each round of
conferences since Windows
Azure was first announced,
Microsoft has ratcheted down the
rates. As Benjamin Day pointed out,
“The recent announcements [at 
June’s TechEd conference] about 
Azure virtual-machine billing and the 
announcements from the Visual Studio 
team about cloud-based load testing 
should be extremely exciting for development
teams.”

The big news is that virtual 
machines uploaded to Azure but not 
actively running will no longer incur 
any costs. Another change is that VMs 
are billed by the minute instead of by 
the hour. Those savings could really add up,
especially for testing environments and 
the like where the virtual machine is 
stopped and started often. 
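
To make the per-minute change concrete, here is a minimal sketch in Python comparing the two billing models for a test VM that is started and stopped several times a day. The hourly rate and the session lengths are hypothetical, chosen only for illustration; they are not Azure's published prices. The sketch also assumes that the per-hour model rounds every session up to a full hour.

```python
import math

# Hypothetical rate for illustration only; not an actual Azure price.
HOURLY_RATE_USD = 0.09


def cost_billed_by_hour(session_minutes, hourly_rate=HOURLY_RATE_USD):
    """Assume each start/stop session is rounded up to a whole hour."""
    billed_hours = sum(math.ceil(m / 60) for m in session_minutes)
    return billed_hours * hourly_rate


def cost_billed_by_minute(session_minutes, hourly_rate=HOURLY_RATE_USD):
    """Charge only for the minutes the VM was actually running."""
    return sum(session_minutes) * hourly_rate / 60


# A test VM started five times in one day, for 10 to 65 minutes each time.
sessions = [10, 25, 65, 15, 40]
by_hour = cost_billed_by_hour(sessions)
by_minute = cost_billed_by_minute(sessions)
print(f"billed by the hour:   ${by_hour:.4f}")
print(f"billed by the minute: ${by_minute:.4f}")
```

With these made-up numbers, six billed hours shrink to 155 billed minutes, so the per-minute model costs less than half as much. The gap grows with the number of short sessions, which is why frequently restarted test environments stand to benefit most.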

IT vs. the cloud

Development groups and system administrators have always had a strained relationship.
In some cases this has been an advantage, since if IT and development colluded,
that could streamline deploying backdoors and other bad things. With the advent of
major organizations leveraging Azure, however, we are seeing different kinds of conflict
between the groups. Many IT organizations see the efficiencies of the cloud as
bad for their longevity and have taken obstructionist positions.

Workers of every stripe sometimes make mistakes that can plague their productivity,
but system administrators must be held to a higher standard because the ramifications
of mistakes on their part usually have a higher price tag. There are times
where fear of making mistakes, especially with new technologies, can cause resistance
on the part of these system administrators.

Bad habits embedded in standard practices are less common than simple mistakes,
but they carry a far greater cost. For example, most organizations will, as a standard
practice, format systems that are no longer needed (a good security practice) and
then build on a new OS image when and if the server hardware is pressed back into
service. The few organizations that ignore this clear best practice cause real
headaches for their developers, who end up having a hard time predicting behaviors
thanks to remnants from a server's last role.

Virtualization has made this whole conversation go away for many organizations,
but there are still some who will not only fail to format the system coming out of service,
but will simply uninstall previous applications and install new ones. These are the
same IT teams that are typically too lazy to even patch systems, and who treat service
packs needed to support server applications as a special request.

Having worked in countless data centers myself, I can attest that while these horror-show
IT teams exist, they are not widespread. What is widespread is a consensus
among admin staffs that developers always ask for superfluous requirements, as well
as a general lack of understanding of what kinds of resources an enterprise application
needs to get things done. The problem is that there is definitely an "Us vs. Them"
mentality between IT staff and developers on a grand scale. Rather than being viewed
as customers to be served, developers are treated as gremlins to be thwarted.

The PaaS option of Azure advertises that you can eliminate
the IT staff from the equation, and that does tend to make IT
people less enthusiastic about the prospect of PaaS being the
way forward for their company. If you refer to Figure 1, the savings
highlighted are mostly at the expense of IT. It would be
an oversimplification to say that Azure eliminates the
need for an IT department, but far from helping IT staff,
it is clearly automating away work that is currently
done by network administrators.

- Patrick Hynds

When Windows Azure was first
announced, the billing was problematic.
There was no good way to track what
things were costing or would cost.
There were many different values that
were used to incur costs, and it seemed
that you could easily and quickly be
nickel-and-dimed to death.

Over time, though, this has gotten 
much better, including clear billing 
insight via the Azure portal and the 
removal of some of the more onerous 
items from the virtual machine offering. 
Removing friction is important to adoption,
especially for a new frontier like the
cloud. En Pointe’s Shah noted that with
Azure, “You can get up and running with
just a credit card. And it’s pay as you go.”

This last part is a critical point since 
traditionally you have to provision systems
for the highest level of usage and
bear the cost of that level. And if you or
your organization is an MSDN subscriber,
then you already get monthly
credits to use toward Azure services. 

The final price for the Web Site offering
for the Standard level (formerly
called Reserved) showed that Microsoft
Azure is not a cure-all. Web Sites can be
a great service, but at US$10 per month
for simple website hosting, it is not quite
competitive with the $4 per month available
from GoDaddy and other traditional
hosting providers. Microsoft has yet to
finalize the pricing on the Web Site 
Shared level, which allows for most of 
what Standard offers, but with less scale 
and no SLA. 

Going from here to there 

The most common question from those 
starting to use Azure is where and how 
to start. For many, the easy way is to 
pick a website or Web-based application
and put it up on Azure as a website.

This will get you accustomed to 
using the Azure interface, which is 
available via the portal login at 
www.windowsazure.com. It is very easy 
to get a new or existing site up and running
once you have an Azure account
set up. Figure 2 shows some of the 
blogging systems that can be rapidly set 
up via the gallery. 

Microsoft has provided choices for 
getting things done with Azure, including 
working through the portal wizards and 
deploying sites directly from Visual Studio.
You can also deploy your Web application
from Git repositories such as
GitHub or Team Foundation Service.
The advantage of deploying from a
source-code repository is that you can get
continuous source integration such that
as changes are made to the code, they get
pushed automatically to the Azure-hosted
site. Once the site is deployed, the
domain can be set to a vanity domain 
with a bit of DNS manipulation, and you 
can even leverage SSL. 

The next logical step depends on 
how your environment is currently configured
in terms of virtualization platforms
and systems being used. For
example, if you need to add a SharePoint
server to the mix of an existing
solution, but want to avoid provisioning
that server, you can set up a virtual
machine to play that role, yet still integrate
it into your on-premise systems.
With Microsoft System Center, servers 
both onsite and in the cloud can be 
managed together rather seamlessly. 

The technology that allows for the
integration between your on-premise
systems and Azure is called the Service 
Bus, formerly called AppFabric. Service 
Bus Relay lets you build these hybrid 
solutions that span Azure and your own 
data centers. For example, you can use it 
to provide secure and reliable communications
between your systems via Web
services. This allows you to surface data 
to the Azure solution, and to move the 
parts of the solution that work best for 
you to Azure while keeping others under 
your own roof. 

According to En Pointe’s Shah,
“Many of our customers are taking a 
hybrid approach, which bridges the 
company’s infrastructure and leverages 
Azure. We are there to help companies 
understand what the best implementation
approach is for their company.”

The catch, though, is that it is not trivial
to make proper use of the Service
Bus. It stands out as perhaps the most 
esoteric of the Azure mechanisms, 
thanks in no small part to its heavy 
reliance on WCF. When asked about this 
aspect, Day said, “If your app is already 
running in your data center and you’re 
happy with it, it’s not necessarily simple 
or risk-free to move it to the PaaS cloud.” 

Getting better all the time 

Since he attended TechEd in New 
Orleans, I asked Day what he thought 
would be the thing that got organizations
over the hurdle of using Azure.
His answer was pretty compelling: “It’s
all about removing friction in the development
process. It’s all about removing
distractions from the development 
process. The new cloud-based load 
testing from Team Foundation Service 
does just that.” 

He continued, “Organizations often 
want to load test their application, but 
can’t mentally get over the hurdle of 
provisioning the hardware. In order to 
do load testing internally, you probably 
will want a minimum of three to five 
servers that will run the Visual Studio 
Load Test Controllers and Agents. If 
you’re part of the development team, 
convincing your internal IT organization 
to give you one server is often close to 
impossible, let alone three to five. Now 
if something is difficult or feels like it’s 
going to be a black hole, how likely is it 
that you’re going to do it? Answer: not 
likely. But you still know that you should 
load test your Web applications, right? 

“Well, you could do this with Azure 
IaaS VMs, but the Visual Studio team 
has just announced their cloud-based 
load-testing offering. With their load-testing
service, you don’t worry about
hardware or operating systems, and you
don’t worry about configuring the
[on-premise] Load Testing
services. You just connect
to their Load
Testing service using
Visual Studio and start
your load test. Done
and done.”

Code Watch

BY LARRY O'BRIEN 


Functional JavaScript 


JavaScript has emerged as a language in which
every programmer needs to be competent, if 
not masterful. It is the lingua franca of the Web 
client, and while alternatives such as Microsoft’s 
TypeScript and Google’s Dart are worth watching, 
they are not anywhere close to “crossing the 
chasm” from early adoption to majority use. 

JavaScript is a problematic language. Brendan
Eich’s heroic effort to embed a language in
1995’s Netscape Navigator is justifiably legendary,
but the semantics of the language contain any number
of things that make you say “Wut?” (“The only
proper response to something that makes no possible
sense”). Although an unfair comparison, it’s hard
not to smirk a little at the difference in thickness 
between Douglas Crockford’s “JavaScript: The 
Good Parts” (176 pages) and David Flanagan’s 
“JavaScript: The Definitive Guide” (936 pages). 

For a variety of reasons, for a decade JavaScript 
was not shoeboxed into any particular programming
style: Programmers rarely argued about the
architecture and design of JavaScript modules, 
which were considered more of a “git ‘er done” 
tool to gain some functionality on the Web page. 

Rather than a network of interconnecting 
objects, most JavaScript Web pages relied on writing
event-handling functions, which could be created
inline, anonymously. This was frequently
abused, and it was a common sight to open a Web 
page and find a vast tree of functions nested within 
functions nested within functions without the 
slightest rhyme or reason. 

That changed with the release (and exploding 
popularity) of jQuery in the mid-2000s. jQuery 
popularized passing and returning (rather than 
nesting) functions. Users were, unknowingly, 
becoming familiar and comfortable with functional 
programming techniques. 

I’ve talked about functional programming a good 
deal in this space, generally in the context of languages
such as Haskell, Scala and F#. These languages
all feature sophisticated static type systems,
and while I can argue both sides of the “static vs. 
dynamic typing” coin (on any given day I generally 
see the merits in whichever approach I’m not using), 
it seems a little strange to promote JavaScript as a 
functional programming language from the get-go. 
Not to Michael Fogus, author of “Functional
JavaScript,” and Reginald Braithwaite, author of
“JavaScript Allongé.” These two books cover much
of the same ground, assuming that the reader is 
comfortable in JavaScript (better still if they’ve 
digested “JavaScript: The Good Parts”) but not 
versed in functional programming. This allows 
both books to focus on the core issue of teaching 
functional concepts (although some language 
details, such as JavaScript’s confusing “truthy” 
semantics, inevitably crop up) and highlighting the 
benefits of functional approaches. 

The books have very different tones. Braithwaite’s
is breezier and covers more ground, but
risks having the less-attentive reader lose the trail.
Fogus’ book is more conscientious about the details,
which can make the text a little dry at times, but
may provide a more solid foundation for the reader.
I cannot recall two books that struck me as so
equal in their technical content 
and accuracy while having such 
disparate styles. 

Both authors emphasize pragmatics
over programming-language
theory, but the scope of
both books is deep enough so 
that there’s no avoiding sentences 
that are difficult to casually unpack (“Apply is a 
method that is implemented by every function that 
takes a context as its first argument, and it takes an 
array or array-like thing of arguments as its second 
argument”). Both books do a good job of presenting 
many source-code examples and practical recipes. 
Neither book pretends to be a comprehensive text 
on functional programming. Both authors write 
clearly and accurately. 

Even with both books at hand and fresh in my
mind, when I had a small challenge yesterday (converting
an array of strings holding file paths into a
tree-like data structure), I hacked it with an F#
REPL, not a JavaScript one. But although there
are F#-to-JavaScript compilers, I wouldn’t expect
to use F# or Scala or Haskell inside an enterprise
application. JavaScript is, for the foreseeable
future, the language of the browser. Every developer
who writes Web applications should have, on
their shelves, a copy of Crockford’s “JavaScript:
The Good Parts,” and one (if not both) of
“JavaScript Allongé” or “Functional JavaScript.”
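
The file-path challenge mentioned above lends itself to a functional fold, so it makes a handy illustration of the style both books teach. The sketch below is in Python rather than F# or JavaScript, and it is not the code from that REPL session: the function names, the nested-dictionary representation and the sample paths are all assumptions made for the example.

```python
from functools import reduce


def insert_path(tree, path):
    """Return a new tree (nested dicts) with one slash-separated path added."""
    def insert_parts(node, parts):
        if not parts:
            return node
        head, rest = parts[0], parts[1:]
        child = insert_parts(node.get(head, {}), rest)
        # Build a new dict instead of mutating the existing node.
        return {**node, head: child}

    parts = [p for p in path.split("/") if p]
    return insert_parts(tree, parts)


def paths_to_tree(paths):
    """Fold a list of path strings into a single nested-dict tree."""
    return reduce(insert_path, paths, {})


# Hypothetical input paths, for illustration only.
paths = ["src/app/main.js", "src/app/util.js", "src/README.md", "docs/index.html"]
tree = paths_to_tree(paths)
print(tree)
```

Each path is folded into the accumulated tree by a function that returns a new value rather than changing the old one, which is the habit of passing and returning functions and data that the column credits jQuery with popularizing.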
Guest View

BY KRIS BARKER 

Be part of the SAM 'One Percent'

When it comes to software license compliance,
publishers and end users may not be as far
apart as we think. Publishers worry about revenue
leakage. End users worry about audits. But for both
parties, the ability to document and confirm compliance
across an increasingly diverse array of architectures,
platforms and devices is the ultimate goal. And
better, smarter automation may be the key to making
both parties happier, and happier with each other.

A strong sense of mutual purpose was the most
striking (and surprising) impression I came away
with from the 2013 Compliance Manager Summit.
This event brought together software publishers
and compliance professionals to discuss trends in
software licensing and pricing, and to share best
practices and technologies for enforcing license
compliance.

The session that interested me the most was a
panel made up of end-user companies
that discussed their insights
as a result of their efforts to maintain
license compliance and survive
vendor audits. The panelists
represented a number of high-profile
companies such as Visa,
Wells Fargo and Kaiser Permanente.
All of them had undergone audits. All were
serious about compliance and had made a formal,
long-term commitment to managing their compliance
issues intelligently and cost-effectively.

In fact, everyone in the room, including publishers,
agreed that these companies were so advanced
in their software asset-management (SAM) programs
and practices that they represented the elite
few. I have taken to labeling these companies the
“One Percent,” not just because they are among the
very rare enterprises that are ahead of the curve in
truly understanding and reaping the benefits of SAM
best practices, but also because, unlike the other
99%, they can clearly articulate the value of SAM.

How has this handful of companies been able to
distinguish itself from the 99% of companies that
continue to struggle mightily with license audits
and SAM? Based on what I heard at the conference,
here are what I believe are the two main factors
that make them stand apart:

• They have learned to avoid full-blown software
audits. Upon receipt of the initial audit letter, they
provide vendors with accurate, high-level documentation
that shows they are compliant with the
licenses in question.

• They continually perform software consumption 
analysis and optimize their license portfolios, achieving
approximately 20% savings in their annual software
spending. While many companies may be able
to manage their compliance risk reasonably well,
most have not yet taken this second step toward full
SAM “enlightenment” by embracing regular, systematic
software-usage analysis. As a result, they’re
absorbing licensing costs for products that go unused.

We’ve known for quite some time now that 
companies that make a concerted effort to track 
application usage over time will find plenty of 
opportunities to reduce their software spending. 
Many of our own customers have told us this anecdotally,
but because of competing priorities, they
rarely document those savings. By contrast, the 
“One Percenters” make a point of documenting 
their savings on a regular basis. 

How did these companies come to find themselves
among the “One Percent”? First, all have undergone 
painful, costly audits in the past. Second, in response 
to those experiences, their IT departments have 
been able to garner solid executive support for their 
SAM initiatives. Third, these companies have invested
significant resources in implementing airtight
SAM technologies and processes. 

While many of those companies in the 99% have 
also undergone full-blown audits, most of them have 
failed to take the next steps that would help them 
avoid repeating that unpleasant experience. I recently
spoke with an asset manager at a large enterprise
who still can’t convince members of the C-suite to
make a long-term, strategic investment in software
asset management, several months after the company
was nailed by Microsoft with more than a million
dollars in fines for non-compliance!

I suspect if executive decision-makers understood
(or IT managers could more forcefully articulate)
that their organizations could potentially
avoid software audits (or at least the most disastrous
of outcomes), as well as reduce software
spending by tens or even hundreds of thousands of 
dollars, more organizations would truly embrace 
SAM. And, thankfully, the resulting benefits would 
be enjoyed by more than just the “One Percent.”

Analyst View


BY ROB ENDERLE 


How BlackBerry could rise again 


As we got to the end of the 1980s, IBM was all
but dead. It was on death watch, and it actually
had gone from having one of the most powerful
brands to one of the least powerful ones. It
went from the tagline of “No one has ever lost their 
job buying from IBM” to “Buy from IBM and lose 
your job.” 

Apple was in even worse shape a decade later, 
and we were counting the weeks to when that company
would have to shut its doors forever. Chrysler
has had several near-death experiences, but now it
is (granted, as part of Fiat) a company to be reckoned
with. And AT&T literally came back from the
dead to once again become dominant in the U.S. 
Companies that were once great can return; they 
just need to find a way back to greatness. Let’s
explore how some of these companies recovered to 
see if we can find a way back for BlackBerry. 

IBM: Who wants to live forever? 

IBM does. In fact, of all the companies we’ll talk
about, only IBM was designed at the start to be 
immortal—and they almost died despite that. 
Their problem was that they executed a lock-in 
strategy incredibly well, then concluded that, since 
customers couldn’t switch, they would pay whatever
IBM charged for whatever IBM wanted to sell.
Toward the end, IBM was charging a lot for what
was basically bug fixes for unreliable products, and 
folks found a way to move. 

Louis Gerstner and Jerry York cut IBM to the 
bone, got rid of underperforming units, refocused 
on the customer, and created the strongest marketing
team ever built for technology. They then put
marketing ahead of development, telling stories 
about products that didn’t exist yet as if they not 
only existed but were market-leading. 

The first wave IBM dominated was e-commerce,
and they owned the mindshare for this segment
long before they had a solution that was actually
competitive in it. They fixed the perception
first, then they fixed the company. 

Apple: Creating magic from mulch 

When Steve Jobs took over Apple, he had been 
outspoken that he wouldn’t touch Apple’s products 
at the time with a 10-foot pole and rubber gloves. 
He was embarrassed about how bad they were.
Apple had drifted away from the amazing company
he had helped create to one that appeared to be
chasing companies like Dell, IBM and HP, and
doing a horrid job of it. They didn’t have a negative
brand image like IBM, but after the launch of Windows
95, they didn’t seem to have much of a reason
for existing either. 

But once Jobs took over, he put a massive effort 
into fixing Apple’s image first. Initially he got a 
US$150 million cash infusion from Microsoft,
which served as a vote of confidence to other
investors, increasing the cash available and giving
him a war chest, much of which went to marketing
and the hunt for something no one else had.

With marketing, he presented Apple’s lame 
products as something special, and he maintained 
sales until he could get them redesigned back into 
products he could be proud of. Eventually he 
found the iPod, which he 
brought in from the outside to 
drive the recovery. But if he hadn’t
stabilized sales first, he never
would have gotten to the iPod, 
which had a massive marketing 
campaign associated with it and a 
tight focus on user experience, a 
formula that followed into the iPhone and iPad. 



Rob Enderle is a principal analyst at the Enderle Group.


Wrapping up: BlackBerry 

There is no doubt BlackBerry is spending a ton on 
marketing, but they aren’t leading a parade like 
IBM did with e-commerce, nor have they found an 
iPod product like Apple did. They are still largely 
chasing the iPhone and Android products, but with 
a unique business focus that isn’t resonating at the 
moment. They have advantages through QNX-to-Automotive
connectivity, and they could anticipate
wearable technology or some other market move 
to create the next iPod-like product. 

In the end, BlackBerry’s success likely lies not 
in them trying to be a good follower, but in creating
a new offering, and then, like Apple and IBM,
driving the market to them as the leader of that 
unique segment/market. 

After all, that is how the BlackBerry came to
market in the first place. It was never a “me too” 
product; it was the “it” product, and to succeed, 
the firm has to create another one.

Industry Watch


BY DAVID RUBINSTEIN 


Why all the hand-wringing? 


David Rubinstein is 
editor-in-chief of SD Times. 


One of the hot topics buzzing on the Web in
the past couple weeks has been the idea that 
Bill Gates is planning to return to Microsoft in a 
more active day-to-day operational role. Talk about 
your reorganizations! 

While the rumor has since been widely dismissed 
as wishful thinking by Microsoft shareholders 
rocked by the near billion-dollar write-down of costs 
associated with the Surface RT tablet, Microsoft 
continues to struggle to find its footing in the post- 
PC world it helped to identify and define. 

Microsoft hasn’t helped itself in this matter, 
with a never-ending stream of moves that give the 
appearance that the company is 
floundering. Windows 8 was a 
bold departure that would revolutionize
how we interact with
our computers. Windows 8.1 
caves in to those who complained 
they couldn’t find their desktop 
applications. 

The company has been talking about how its software
will change the way information workers do
their jobs. Yet we still do not see a tight Yammer- 
SharePoint integration, and Microsoft’s cloud 
emphasis does not resonate with many customers 
uncomfortable with hosting data outside the firewall. 

Then there is the reorganization. Microsoft 
announced last month that it is consolidating eight product
units into four, in part to eliminate redundancies
in engineering efforts as well as to more clearly align 
divisions with hardware, software and services. The 
notorious fighting between divisions for resources 
and credit for innovation is not to be part of the new 
Microsoft culture, CEO Steve Ballmer has decreed. 

All of which gives the impression that Microsoft 
is struggling mightily. 



But can we pause for a reality check here? 
In its latest financials, Microsoft reported quarterly
net income of US$5 billion on sales of almost
$20 billion. 

Meanwhile, it’s not as if Apple and Google 
haven’t had their share of troubles. Apple has had 
to deal with the perception that innovation and 
creativity died along with founder Steve Jobs. It is 
getting waxed by Android in the smartphone wars, 
while recent hacks and downtime have exposed its 
infrastructure as vulnerable. And Google’s stock 
took a hit when the company reported that its cost- 
per-click price declined 6% year over year. 

We have this fascination with over-scrutinizing 
these companies, wringing our hands to see if we 
can read something into our culture, our economy 
and our sense of self through their performance. 

Here’s my takeaway: Among the three, there is 
almost $1 TRILLION in market capitalization. 
Profits for one quarter alone are in the tens of billions.
We continue to consume everything they’re
offering up at a phenomenal rate, and can’t wait to 
get our hands on the next thing out the door. 
Google Glass? Chromecast? The Xbox One? 
Apple’s new iOS 7 and OS X Mavericks? 

Sign us up for all of it.

DATE             EVENT                    CITY           SPONSOR   LINK
Sept. 3-5        Qualcomm Uplinq          San Diego      Qualcomm  www.uplinq.com
Sept. 10-11      Intel Developer Forum    San Francisco  Intel     intel.com/idf
Sept. 16-19      Storage Developer Conf.  Santa Clara    SNIA      www.snia.org/events/storage-developer2013
Sept. 22-26      JavaOne                  San Francisco  Oracle    www.oracle.com/javaone
Sept. 22-26      Oracle OpenWorld         San Francisco  Oracle    www.oracle.com/openworld
Sept. 27-29      ISVCon                   Reno, Nev.     ISVCon    www.isvcon.org
Sept. 29-Oct. 4  STARWEST                 Anaheim        SQE       starwest.techwell.com


For a more complete calendar of U.S. software development events, see www.sdtimes.com/content/eventcalendar.aspx. Send news about upcoming events to events@bzmedia.com.

Big Data gets real
at Big Data TechCon! 



Discover how to master Big Data from real-world practitioners - instructors 
who work in the trenches and can teach you from real-world experience! 


Come to Big Data TechCon 
to learn the best ways to: 


• Collect, sort and store massive quantities 
of structured and unstructured data 

• Process real-time data pouring 
• Master Big Data tools and
technologies like Hadoop, Map/Reduce, 
NoSQL databases, and more 
• Learn HOW TO integrate data-collection
technologies with analysis and business-analysis 
tools to produce the kind of workable information 
and reports your organization needs 

• Understand HOW TO leverage Big Data to help 
your organization today 


Big Data TechCon

San Francisco 


October 15-17, 2013

www.BigDataTechCon.com 


The HOW-TO conference for Big Data and IT professionals 



Big Data TechCon™ is a trademark of BZ Media LLC. 















Welcome! 




The HOW-TO conference for Big Data 
and IT Professionals! 

• Learn tips, tricks and techniques that will make you 
your company’s Big Data Expert! 

• Discover how to master Big Data from real-world 
practitioners—instructors who work in the trenches 
and can teach you from real-world experience 

• Hear about other related technologies that can help you 
with your Big Data projects: the cloud, efficient storage 
and warehousing methods, and more 

• Come to Big Data TechCon to master Big Data—get 
practical answers to real problems, learn tangible steps 
to real-world implementation 
Dear colleague,

Big Data is affecting all of us and may well be the future of 
computing as we know it. Companies and IT professionals who 
make the most of this new field are sure to prosper over the next 
5-10 years. Many conferences exist to trumpet the 
potential and fan the hype surrounding Big Data. 

But until now, there has been no conference that 
teaches you HOW to do it. 

Ted Bahr, Conference Chairman

Big Data TechCon is the HOW-TO conference for
Big Data. Featuring practical tutorials and more than
50 technical classes to choose from, Big Data TechCon
is the biggest, most info-packed, most practical
HOW-TO Big Data conference in the world. No hype.
All tech, all the time. What else makes Big Data TechCon special?

• Most of our speakers have been thoroughly vetted using 
evaluations from attendees at previous Big Data TechCons on 
the quality of information presented, as well as the ability to 
state it clearly and produce clear takeaways that you can apply 
in your business today. 

• Pull together your own custom conference by choosing from up 
to six classes in any given timeslot. Whether it’s a deep dive into 
Hadoop, a thorough introduction to Cassandra, or intensive 
classes on data analytics or machine learning, you put together 
the conference that works best for YOU. 

• Network with other technical IT professionals like yourself.
Most of our attendees are software and data architects, software
developers and engineers, data scientists, and business and data 
analysts, and there are great opportunities to talk with others 
facing the same challenges as you. Plus, there are meet-ups and 
other chances to meet and talk further with our expert speakers. 

• Great keynotes to inspire you; this conference features 
Doug Cutting, the founder of Hadoop! 

• Extra events for more networking: receptions, lunches, 
our ice cream social and the Women in Big Data Luncheon. 

• Check out cutting-edge technologies and solutions for Big Data 
in our exhibit hall and round out your three-day experience. 

Whether you are looking at dozens of terabytes or hundreds 
of petabytes, from Avro to ZooKeeper, Big Data TechCon has you 
covered! Bring two or more colleagues and save an extra $100 
each. Regardless, you will save the most off the full conference 
price if you register early. 

See you in San Francisco! 


Produced by 

BZ Media   SD Times   Big Data TechCon


“Big Data TechCon is loaded with great networking 
opportunities and has a good mix of classes with technical 
depth, as well as overviews. It’s a good, technically-focused 
conference for developers.” 

— Kim Palko, Principal Product Manager, Red Hat 
Tuesday, October 15

7:30 am - 6:30 pm  Registration Open
7:30 am - 8:30 am  Morning Coffee
8:30 am - 10:00 am  Tutorials
10:00 am - 10:15 am  Coffee Break
10:15 am - 12:15 pm  Tutorials
12:15 pm - 1:15 pm  Lunch Break
1:15 pm - 3:00 pm  Tutorials
3:00 pm - 3:15 pm  Coffee Break
3:15 pm - 5:00 pm  Tutorials
5:15 pm - 6:30 pm  Lightning Talks

Wednesday, October 16

7:30 am - 7:00 pm  Registration Open
7:30 am - 8:30 am  Morning Coffee
8:30 am - 9:30 am  Technical Classes
9:30 am - 9:45 am  Coffee Break
9:45 am - 10:45 am  Keynote
11:00 am - 12:00 pm  Technical Classes
12:00 pm - 7:00 pm  Exhibit Hall Open
12:15 pm - 12:45 pm  Sponsored Classes
12:45 pm - 1:45 pm  Lunch Break
12:45 pm - 1:45 pm  Women in Big Data Luncheon
1:45 pm - 2:45 pm  Technical Classes
2:45 pm - 3:15 pm  Coffee, Ice Cream in Exhibit Hall
3:15 pm - 4:15 pm  Technical Classes
4:30 pm - 5:30 pm  Keynote
5:30 pm - 7:00 pm  Networking Reception in Exhibit Hall
7:15 pm - 8:45 pm  Fireside Chats

Thursday, October 17

7:30 am - 4:00 pm  Registration Open
7:30 am - 8:45 am  Morning Coffee
8:45 am - 9:45 am  Technical Classes
10:00 am - 11:00 am  Keynote - Doug Cutting
11:00 am - 3:30 pm  Exhibit Hall Open
11:00 am - 11:30 am  Coffee Break in Exhibit Hall
11:30 am - 12:30 pm  Technical Classes
12:45 pm - 1:45 pm  Lunch Break
1:45 pm - 2:45 pm  Technical Classes
2:45 pm - 3:15 pm  Coffee Break & Prizes in Exhibit Hall
3:30 pm - 4:30 pm  Technical Classes
4:30 pm  Conference Closes


Keynotes 

Thursday, October 17 

10:00 am-11:00 am 

Doug Cutting 

Founder of Hadoop 

Doug Cutting is the creator of numerous successful 
open-source projects, including Lucene, Nutch and 
Hadoop. Doug joined Cloudera in 2009 from Yahoo, 
where he was a key member of the team that built 
and deployed a production Hadoop storage and analysis cluster for mission-critical
business analytics. Doug holds a Bachelor’s degree from Stanford
University and sits on the Board (and is currently chairman) of the Apache 
Software Foundation. 
Special Events

Tuesday, October 15

5:15 pm - 6:30 pm  Lightning Talks
Learn something new in a handful of short, targeted talks,
PLUS names will be drawn for free giveaways.

Wednesday, October 16

9:45 am - 10:45 am  Keynote

12:00 pm - 7:00 pm  Exhibit Hall Open
See how the Big Data ecosystem is growing and evolving by
speaking with technical experts in our Exhibit Hall.

12:45 pm - 1:45 pm  Women in Big Data Luncheon
Please join us for this special event filled with delicious food,
wonderful networking opportunities and an open forum to
discuss what it’s like being a woman in the Big Data industry.
All women attendees are welcome!

2:45 pm - 3:15 pm  Coffee, Ice Cream in the Exhibit Hall

4:30 pm - 5:30 pm  Keynote

5:30 pm - 7:00 pm  Networking Reception in the Exhibit Hall

7:15 pm - 8:45 pm  Fireside Chats
Managing Big Data Expectations
As if working with petabytes and zettabytes (and yottabytes!?) of information isn’t difficult
enough, you also have to think about managing the requirements and expectations of
management. How do you deal with an unreasonable deadline? How best to anticipate
the right budget? What about questionable requirements that could infringe on someone’s
privacy, which has become a sensitive subject since the development of the National
Security Agency’s surveillance program? These difficult scenarios and more will be covered
in the Fireside Chats hosted by some of Big Data TechCon’s expert speakers, so come ready
with questions and your own experiences of dealing with these situations.

Thursday, October 17

10:00 am - 11:00 am  Keynote
Doug Cutting, Founder of Hadoop

11:00 am - 3:30 pm  Exhibit Hall Open
Come explore the latest in Big Data developer resources in
our Exhibit Hall.

11:00 am - 11:30 am  Coffee Break in Exhibit Hall

2:45 pm - 3:15 pm  Winner’s Circle prizes announced in the Exhibit Hall
Conference Planner

TECHNICAL LEVELS: Overview, Intermediate, Advanced

Tuesday, October 15

TIME
  TYPE | TITLE | INSTRUCTOR | TECHNICAL LEVEL

7:30 am - 6:30 pm  Registration Open

8:30 am - 12:15 pm  MORNING TUTORIALS (10:00 am - 10:15 am Coffee Break)
  FULL-DAY TUTORIAL | Data Science in a Spreadsheet: Learning What’s Really Going on in Those Black-Box Models | John Foreman | Intermediate
  FULL-DAY TUTORIAL | Hadoop: A One-Day, Hands-On Crash Course | Sameer Farooqui | Overview
  HALF-DAY TUTORIAL | Cascading Tutorial | Paco Nathan | Intermediate
  HALF-DAY TUTORIAL | Engineering Your Approach to Big Data Solutions | Tony Shan | Intermediate
  HALF-DAY TUTORIAL | Getting Started with Cassandra | Ben Coverston | Overview
  HALF-DAY TUTORIAL | Introduction and Best Practices for Storing and Analyzing Your Data with Apache Hive | Mark Grover | Intermediate

12:15 pm - 1:15 pm  Lunch Break

1:15 pm - 5:00 pm  AFTERNOON TUTORIALS (3:00 pm - 3:15 pm Coffee Break)
  FULL-DAY TUTORIAL | Data Science in a Spreadsheet: Learning What’s Really Going on in Those Black-Box Models | John Foreman | Intermediate
  FULL-DAY TUTORIAL | Hadoop: A One-Day, Hands-On Crash Course | Sameer Farooqui | Overview
  HALF-DAY TUTORIAL | Cassandra + S3 + Hadoop = Quick Auditing and Analytics | Anton Yazovskiy | Advanced
  HALF-DAY TUTORIAL | NoSQL for SQL Professionals | Dipti Borkar | Intermediate
  HALF-DAY TUTORIAL | Programming with Scalding and Algebird | Krishnan Raman | Advanced

5:15 pm - 6:30 pm  Lightning Talks
Wednesday, October 16

TIME
  TITLE | INSTRUCTOR

7:30 am - 7:00 pm  Registration Open

8:30 am - 9:30 am
  Analytics Maturity Model | John A. De Goes
  Apache Cassandra—A Deep Dive | Ben Coverston
  Extending Your Data Infrastructure with Hadoop | Jonathan Seidman
  HBase Use Cases | Justin Hancock
  How to See and Understand Big Data | Jock Mackinlay
  Pattern: A Machine-Learning Library for Cascading, Migrating PMML Models to Hadoop | Paco Nathan

10:00 am - 10:15 am  Coffee Break

11:00 am - 12:00 pm
  Graph Database Use Cases | Max De Marzi
  HBase Schema Design Done Right | Michael Segel
  Implementing a Simple Mongo Application | Deep Mistry
  In-Database Predictive Analytics | John A. De Goes
  Large-Scale, High-Accuracy Entity Extraction Made Easy | Tim Furche
  The Hadoop Ecosystem: Putting the Pieces Together | Jonathan Seidman

12:00 pm - 7:00 pm  Exhibit Hall Open

12:15 pm - 12:45 pm  Sponsored Classes

12:45 pm - 1:45 pm  Lunch Break, Women in Big Data Luncheon

1:45 pm - 2:45 pm
  Analyzing Tweets with HBase, Part I | Sameer Farooqui
  Intro to Machine Learning: A Crash Course, Part I | Paco Nathan
  Introduction to Apache Pig, Part I | Jeffrey Breen
  Running, Managing and Operating Hadoop at Sears | Justin Sheppard
  Understanding MongoDB: New Features Explored Through Code | Jonathan Freeman
  Untangling the Relationship Hairball with a Graph Database | Max De Marzi
  Using Hadoop to Lower the Cost of Data Warehousing: The Paradigm Shift Underway | Dave Jespersen

2:45 pm - 3:15 pm  Coffee, Ice Cream in Exhibit Hall

3:15 pm - 4:15 pm
  Analyzing Tweets with HBase, Part II | Sameer Farooqui
  Data Modeling and Relational Analysis in a NoSQL World | Michael Miller
  Intro to Machine Learning: A Crash Course, Part II | Paco Nathan
  Introduction to Apache Pig, Part II | Jeffrey Breen
  Running Mission-Critical Applications on Hadoop | Dave Jespersen
  Staying Alive: Ensuring Service Availability in Hadoop | Vinithra Varadharajan
  Storm: Real-Time Data Processing at a Massive Scale | Jill Jacobucci

4:30 pm - 5:30 pm  Keynote

5:30 pm - 7:00 pm  Networking Reception in Exhibit Hall

7:15 pm - 8:45 pm  Fireside Chats
Thursday, October 17

TIME
  TITLE | INSTRUCTOR

7:30 am - 4:00 pm  Registration Open

8:45 am - 9:45 am
  Building an Impenetrable ZooKeeper | Kathleen Ting
  Data Modeling for Chat Messages with Cassandra | Ameet Chaubal
  First Steps to Big Data from MySQL | Dave Stokes
  Hadoop by Example | Serge Blazhievsky
  Managing a World of Data: Geospatial Best Practices | Norman Barker
  Proper Care and Feeding of HBase Coprocessors | Michael Segel

10:00 am - 11:00 am  Keynote - Doug Cutting

11:00 am - 3:30 pm  Exhibit Hall Open

11:00 am - 11:30 am  Coffee Break in Exhibit Hall

11:30 am - 12:30 pm
  A/B Testing in a Big Data Environment | Gabor Melli
  Building Your Own Facebook Graph Search with Cypher and Neo4j | Max De Marzi
  Introduction to Parallel Iterative Machine-Learning Algorithms on Hadoop’s Next-Generation YARN Framework | Josh Patterson
  Making it Real: Leveraging Big Data to Solve Big Problems | Siva Vaidyanatha
  Real-Time Hadoop | Michael Segel
  Seven Deadly Hadoop Misconfigurations | Kathleen Ting
  Time + Space + Transaction: Multidimensional Geospatial Analysis with Hadoop | Dan Rosanova

12:45 pm - 1:45 pm  Lunch Break

1:45 pm - 2:45 pm
  Building Applications That Predict User Behavior Through Big Data Using Open-Source Technologies | Simon Chan
  Getting Started with R and Hadoop, Part I | Jeffrey Breen
  Hadoop Backup and Disaster Recovery 101 | Jairam Ranganathan
  Simple Yet Efficient Web Extraction with OXPath, Part I | Tim Furche
  Windows Azure HDInsight and PowerPivot: Cloud-Based Data Analysis with Familiar and Friendly Tools | Dan Rosanova

2:45 pm - 3:15 pm  Coffee Break & Winners Circle Prizes in Exhibit Hall

3:30 pm - 4:30 pm
  Getting Started with R and Hadoop, Part II | Jeffrey Breen
  Hadoop Design Patterns | Serge Blazhievsky
  Large-Scale Distributed Queries with Apache Drill | Jacques Nadeau
  Selecting the Right Big Data Tool for the Right Job, and Making It Work for You | Eddie Satterly
  Simple Yet Efficient Web Extraction with OXPath, Part II | Tim Furche

4:30 pm  Conference Closes
Full-Day Tutorials
8:30 am - 5:00 pm 

Data Science in a Spreadsheet: Learning What’s Really 
Going on in Those Black-Box Models

John Foreman 

This full-day tutorial will provide lessons on analytics practices 
every data scientist should understand. The tutorial will first
introduce mathematical programming, then clustering and outlier
detection (unsupervised learning), then forecasting, Monte Carlo 
simulation, and supervised AI modeling. The nitty-gritty of each 
practice will be demonstrated using spreadsheets, which you can 
download, follow along with, and keep for later reference. 

Full disclosure: These spreadsheets accompany the chapters in 
the instructor’s book “Data Smart,” which will be released around
the same time as the conference. You will not need the book, just
the spreadsheets. 

Level: Intermediate 

Hadoop: A One-Day, Hands-On Crash Course

Sameer Farooqui 

This full-day tutorial is a fast-paced, vendor-agnostic technical 
overview of the Hadoop landscape, and is targeted at both technical
and non-technical people who want to understand the emerging
world of Big Data, with a specific focus on Hadoop. You will be
introduced to the core concepts of Hadoop, and dive deep into 
the critical paths of HDFS, Map/Reduce and HBase. You will also 
learn the basics of how to effectively write Pig and Hive scripts, 
and how to choose the correct use cases for Hadoop. During the 
tutorial, you will have access to an individual one-node Hadoop 
cluster in Rackspace to run through some hands-on labs for the 
five software components: HDFS, Map/Reduce, Pig, Hive and 
HBase. 

In each sub-topic, you will be provided links and resource
recommendations for further exploration. You will also be given a
100-page PDF slide deck, which can be used as reference material 
after the course. PDFs will also be given out for the five short, 
hands-on labs. No prior knowledge of databases or programming 
is assumed. 

Note: You are required to bring a laptop. If you run into an
issue during the hands-on portions, the instructor is not guaranteed
to be available to help you troubleshoot.

Level: Overview 



Half-Day Tutorials 
8:30 am-12:15 pm 
Cascading Tutorial

Paco Nathan 

The tutorial begins with a quick pre-flight check: Set up and
test your environment, choosing to use either laptop or cloud.
We’ll cover a brief history of Cascading and related open-source
projects (Cascalog, Scalding, etc.), plus an overview of typical use
cases. Then we’ll build and run the simplest-possible Cascading
app, using it to discuss definitions of the most commonly used
components of data pipelines.

We’ll explore some of the theory which supports the use of abstraction
layers for Hadoop: deterministic vs. non-deterministic
query planners, aspects of functional programming, pattern language,
literate programming, and the software engineering considerations
of Hadoop system integration, operationalizing apps,
and design patterns for bringing Enterprise teams together.

Then we’ll work through a progression of sample apps, each
building upon the last to show more sophisticated pipelines and
explore more components of Cascading (Word Count, Customized
Operation, Joins at scale), along with comparisons to
similar constructs in Hive and Pig. We’ll summarize with a full
implementation of TF-IDF (search index) in Cascading, and show
how to instrument and test the app.
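
For readers unfamiliar with TF-IDF, the computation itself can be sketched on a single machine in plain Python. This is not Cascading code and not the instructor's implementation; the function name `tf_idf`, the whitespace tokenizer and the two sample documents are assumptions made only for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per document id.

    docs maps a document id to its text. Tokenization is a plain
    lowercase whitespace split, chosen only to keep the sketch short.
    """
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}

    # Document frequency: how many documents contain each term.
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    n_docs = len(tokenized)
    scores = {}
    for doc_id, tokens in tokenized.items():
        counts = Counter(tokens)
        total = len(tokens)
        # Term frequency times the log of the inverse document frequency.
        scores[doc_id] = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return scores

# Hypothetical sample documents.
docs = {
    "a": "hadoop stores data",
    "b": "cassandra stores data",
}
scores = tf_idf(docs)
```

Terms that appear in every document score zero, while terms unique to one document score highest, which is why the measure is useful for building a search index.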

Branching out into other languages, we will compare Word
Count also in Cascalog and Scalding, then work through examples
using ANSI SQL (Lingual) and PMML (Pattern). We’ll conclude
by reviewing a case study: Using Cascalog on Open Data
from the City of Palo Alto.

Prerequisites: Bash command line, some programming in Java,
plus familiarity with Git/GitHub.

Note: This class is part lecture and part hands-on; you are required
to bring a laptop.

Level: Intermediate


“It’s a great conference about newer, emerging technologies.”

— Deependra Das, Sr. Analyst, Mayo Clinic

Engineering Your Approach to Big Data Solutions

Tony Shan 

This tutorial introduces Big Data Engineering (BDE), which is defined
as the practical application of a systematic, disciplined, quantifiable
approach to the analysis, design, construction, operation
and maintenance of Big Data solutions. BDE is a holistic method focusing
on eight crucial areas: Methodology, Program, Governance,
Resources, Quality, Risk Mitigation, KPI & Financials, and Practice.

BDE also systematically addresses the life cycle of Big Data solutioning
in 12 stages: Plan, Requirement, Analysis, Modeling,
Platform, Design, Development, Integration, Testing, Runtime,
Deployment, and Operation. Each of these 12 stages comprises
individual elements as subdisciplines. For example, the NoSQL
platform options include key-value, column-based, document-oriented,
graph, NewSQL and in-memory stores. Case studies and
working examples will be discussed in great detail in the session
to illustrate the pragmatic use of BDE in real-world implementations.
Best practices and lessons learned are articulated as well.

Level: Intermediate

Getting Started with Cassandra

Ben Coverston 

Unless you have experience with Google BigTable, HBase or 
Cassandra, column-oriented databases are probably an enigma. 
Cassandra's data model is both simple and powerful. It takes 
some time to get used to the differences between the relational 
model and Cassandra's column-based model. 

Cassandra is not schema-less, but we do not model relationships
in Cassandra either. Data Modeling in Cassandra usually
consists of finding the best way to denormalize the data when 
you put the data in the database so that you can retrieve it quickly 
and efficiently. This workshop will prepare you for success when 
modeling your data. This tutorial will dive into Cassandra from a 
developer perspective and give you the tools you need to get 
started with Cassandra today. 
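
The idea of denormalizing at write time can be illustrated without a database. The sketch below uses two in-memory dictionaries as hypothetical stand-ins for two column families, each laid out for one query; the names `events_by_user`, `events_by_day` and `log_event` are assumptions, and nothing here is Cassandra's actual API.

```python
from collections import defaultdict

# Hypothetical stand-ins for two column families, each keyed for one query.
events_by_user = defaultdict(list)  # answers "what did this user do?"
events_by_day = defaultdict(list)   # answers "what happened on this day?"

def log_event(user, day, action):
    """Write the same event once per query pattern (denormalization)."""
    event = {"user": user, "day": day, "action": action}
    events_by_user[user].append(event)
    events_by_day[day].append(event)

log_event("alice", "2013-10-15", "login")
log_event("bob", "2013-10-15", "purchase")
log_event("alice", "2013-10-16", "logout")

# Each read is a single key lookup; no join or ad hoc scan is needed.
alice_actions = [e["action"] for e in events_by_user["alice"]]
day_users = [e["user"] for e in events_by_day["2013-10-15"]]
```

The trade-off shown is the usual one: writes cost more because the data is stored once per access pattern, but each read becomes a direct lookup.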

This tutorial will cover: 

• An introduction to Cassandra in the context of relational 
databases and non-relational alternatives 

• Best practices for modeling your data in Cassandra 

• Cassandra Query Language (CQL version 3) 

• Wide and Composite Columns

• Practical Examples 

• Anti-Patterns (things to avoid) 

For a more advanced look at Cassandra, attend the “Apache
Cassandra—A Deep Dive” class.

Level: Overview 


Introduction and Best Practices for Storing and Analyzing 
Your Data with Apache Hive

Mark Grover 

This tutorial on Apache Hive will introduce Hive, as well as the 
best practices for storage and data analysis in Hive. Hive is an 
open-source data-warehousing system based on top of Apache 
Hadoop that lets you query, mine and analyze the data stored in 
Hadoop clusters using familiar SQL-like queries. 

This tutorial will go through a hands-on exercise on how users 
can use Hive queries to perform data analysis. Because not all 
analysis can be expressed using SQL-like queries, the workshop 
will cover how to write, test and use User Defined Functions and 
User Defined Aggregate Functions in Hive. This tutorial will then 
go through some of the best practices related to partitioning, 
bucketing and joining various datasets in Hive. 

You will also learn how to leverage other technologies in the
Hadoop ecosystem, such as plugging Map/Reduce scripts from
Hadoop directly into your Hive queries, and how to integrate
HBase with Hive to share the data across the two systems.
The tutorial will wrap up with a question-and-answer session.

Note: For this tutorial, you are required to bring a laptop
with Apache Hadoop and Apache Hive installed on it. The best
and easiest way to get started is to download a Demo VM with
Hadoop and Hive installed and configured on it. You may download
such a Demo VM from
ccp.cloudera.com/display/SUPPORT/Cloudera's+Hadoop+Demo+VM+for+CDH4.
VMware, KVM and VirtualBox images are available at that link as well.
Also, please clone the Git repository at github.com/markgrover/bdtc-hive
on the demo VM before you come to the tutorial.

Level: Intermediate 


“Great networking opportunities.”

—TK Lee, Education Technologist, Penn State University 
1:15 pm-5:00 pm

Cassandra + S3 + Hadoop = Quick Auditing and Analytics 

Anton Yazovskiy 

The Cassandra database is an excellent choice when you need
scalability and high availability without compromising performance.
Cassandra’s linear scalability, proven fault tolerance and
tunable consistency, combined with its being optimized for write
traffic, make it an attractive choice for performing structured logging
of application and transactional events. But using a columnar
store like Cassandra for analytical needs poses its own
problems, problems we solved by careful construction of Column
Families combined with diplomatic use of Hadoop.

Our system needed to support both a high volume of structured,
distributed writes and broad analytical capabilities.
Unlike SQL databases, Cassandra does not support ad hoc queries,
and data typically needs to be properly structured and denormalized
at write time. At the same time, decisions need to be made
depending on how often the data is queried, how stale the data
can be, and the allowable latency before results are returned. Our
system handles these different use cases by delegating certain reporting
tasks to Hadoop while keeping some in Cassandra itself.

This tutorial focuses on building a similar system from scratch, 
showing how to perform analytical queries in near real time and 
still getting the benefits of the high-performance database engine 
of Cassandra. The key subjects are: 

• The splendors and miseries of NoSQL 

• Apache Cassandra use cases 

• Difficulties of using Map/Reduce directly in Cassandra 

• Amazon cloud solutions: Elastic MapReduce and S3 

• “Real-enough” time analysis 

In particular, the tutorial dives into ways of handling different
kinds of semi-ad hoc queries when using Cassandra, as well as
the pitfalls in designing a schema around a specific analytics use
case. Some attention will be paid to dealing with time-series data
in particular, which can present a real problem when using
Column-Family or Key-Value store databases.

Level: Advanced 

NoSQL for SQL Professionals 

Dipti Borkar 

With all of the buzz around Big Data and NoSQL (non-relational)
database technology, what actually matters for today’s SQL
professional? Learn more in this tutorial about Big Data and
NoSQL in the context of the SQL world, and get to what’s truly important
for data professionals today. In this tutorial, we will discuss:

• The main characteristics of NoSQL databases 


• High-level architectural overviews of the most popular 
NoSQL databases 

• Differences between distributed NoSQL and relational databases 

• Use cases for NoSQL technologies, with real-world examples 
from organizations in production today 

Finally, we will drill down into Couchbase Server and its underlying
distributed architecture, with a hands-on tour of how
NoSQL databases like Couchbase work in a production environment,
including online rebalancing while adding nodes to a cluster,
indexing and querying, and cross-data-center replication.

Level: Intermediate

Programming with Scalding and Algebird

Krishnan Raman 

This is a hands-on coding tutorial. We will code up a few Scalding
programs in different domains: portfolio optimization,
healthcare, cosine similarity and random forests. While Scalding
looks like a thin Scala API atop Cascading, this appearance is deceptive.
The power of Scala, combined with the mapping, grouping
and joining primitives in Scalding, along with the Algebird
abstract algebra library, allows for a whole new level of flexibility
with Big Data. Matrix operations in Scalding are powered by Algebird,
and using large-dimension matrices as a primitive, we can
tackle problems in diverse domains that employ linear algebra
over very large datasets in a batch mode.
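
Of the domains listed, cosine similarity is the easiest to show in a few lines. The sketch below is plain single-machine Python rather than Scalding or Algebird, and the function name `cosine_similarity` and the zero-vector convention are assumptions made for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Assumed convention: a zero vector is treated as similar to nothing.
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

orthogonal = cosine_similarity([1, 0], [0, 1])
parallel = cosine_similarity([1, 2], [2, 4])
```

At scale the same dot products and norms are typically computed as row-wise operations over a very large matrix, which is the kind of batch linear algebra the tutorial describes.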

Note: You are expected to have installed Scala, Scalding and Algebird
on your laptop before the tutorial commences. Access the
slides and Scala code here: github.com/krishnanraman/bigdata.

Level: Advanced


“If you’re in or about to get into Big Data, this is the 
conference to go to.” 

—Jimmy Chung, Manager, Reports Development, Avectra 
8:30 am - 9:30 am
Analytics Maturity Model 

John A. De Goes 

Every company is at a different stage in leveraging analytics to 
improve their operational efficiency and product offerings. In this 
class, you will learn an eight-stage analytics maturity model that 
companies can use to determine how far they are from the most 
analytical companies. 

Level: Intermediate 


Apache Cassandra—A Deep Dive


Most Popular! 


Ben Coverston 

Recently, there has been some discussion about what Big Data
is. The definition of Big Data continues to evolve. Along with variety,
volume and velocity (which the usual suspects handle well),
other facets have been introduced, namely complexity and distribution.
Complexity and distribution are facets that require a different
type of solution.

While you can manually shard your data (Oracle, MySQL) or extend
the master-slave paradigm to handle data distribution, a
modern Big Data solution should solve the problem of distribution
in a straightforward and elegant manner, without manual intervention
or external sharding. Apache Cassandra was designed
to solve the problem of data distribution. It remains the best database
for low-latency access to large volumes of data while still allowing
for multi-region replication. We will discuss how Cassandra
solves the problem of data distribution and availability at scale.

This class will cover:

• Replication

• Data Partitioning

• Local Storage Model

• The Write Path

• The Read Path

• Multi-Datacenter Deployments

• Upcoming Features (1.2 and beyond)

For the most benefit from this class, attend the “Getting Started
with Cassandra” workshop.

Level: Intermediate 


Extending Your Data Infrastructure with Hadoop


Most Popular!


Jonathan Seidman

Hadoop provides significant value when integrated with an existing
data infrastructure, but even among Hadoop experts there’s
still confusion about options for data integration and business intelligence
with Hadoop. This class will help clear up the confusion.
You will learn:

• How can I use Hadoop to complement and extend my data infrastructure?

• How can Hadoop complement my data warehouse?

• What are the capabilities and limitations of available tools?

• How do I get data into and out of Hadoop?

• How can I use my existing data-integration and business
intelligence tools with Hadoop?

• How can I use Hadoop to make my ETL processing more
scalable and agile?

We’ll illustrate this with an end-to-end example data flow
using open-source and commercial tools, showing how data can
be imported and exported with Hadoop, ETL processing in
Hadoop, and reporting and visualization of data in Hadoop. You
will also learn recent advancements that make Hadoop an even
more powerful platform for data processing and analysis.

Level: Intermediate

HBase Use Cases

Justin Hancock

The class will be an overview of the use cases for HBase. There
will be an initial overview of HBase and its architecture, recapping
key concepts such as column families, region servers and master.
From this point, we will then move onto the use cases HBase is
best suited to, such as high write loads with fast lookups, and the
types of application that can be developed on HBase, such as Time
Series Databases. We will also describe some of the use cases that
HBase is not suited to, which helps avoid dissonance between
technology choice and solution requirements. This includes discussion
of why it isn’t suitable for relational analytics or OLTP.
Finally, we will recap things to consider before embarking on an
HBase implementation project, including design activities,
deployment, tuning and administration considerations. You will
leave with a good overview of HBase and its use cases. These will
be useful for making informed investigations of technology selection,
ensuring that if HBase is chosen, it will satisfy business and
technical requirements.

Level: Intermediate


“The conference is great for learning about the theory, concepts
and technology of Big Data.”

—Waleed Sarwani, Founder, Sarwani Systems
Most Popular!


How to See and Understand Big Data

Jock Mackinlay 

Visual analysis is an iterative process that exploits the power of 
the human visual system to help people work with all kinds of 
data. When data is big, people must overcome the challenges of 
wide data, tall data, and data from multiple sources, often coming 
in fast and furiously. Attend this class to learn how people working
with data can address these challenges. The key technique is
to use multiple coordinated views of data during visual analysis 
and storytelling with data. 

You’ll learn: 

• What research and practice have taught us about designing 
great visualizations and dashboards 

• Fundamental principles for designing effective coordinated 
views for yourself and others 

• How to systematically analyze data from multiple databases 
using your visual system 

The instructor works for Tableau Software, a provider of data-visualization
solutions.

Level: Overview 


Pattern: A Machine-Learning Library for Cascading, 
Migrating PMML Models to Hadoop

Paco Nathan 

Pattern is an open-source project that takes models trained in 
popular analytics frameworks, such as SAS, R, SPSS, MicroStrategy,
etc., and runs them at scale on Apache Hadoop. This machine-learning
library works by translating PMML—an
established XML standard for predictive model markup—into 
data workflows based on the Cascading API in Java. 

PMML models can be run in a pre-defined JAR file with no cod¬ 
ing required. PMML can also be combined with other flows based 
on ANSI SQL (Lingual), Scala (Scalding), Clojure (Cascalog), etc. 
Multiple companies have collaborated to implement parallelized 
algorithms: Random Forest, Logistic Regression, SVM, K-Means, 
Hierarchical Clustering, etc., with more machine-learning support 
being added. Benefits include greatly reduced development costs
and less licensing at scale while leveraging a combination of 
Apache Hadoop clusters, existing intellectual property in predic¬ 
tive models, and the core competencies of analytics staff. 

Sample code in the class will show apps using predictive mod¬ 
els built in R for anti-fraud classifiers. In addition, examples will 
show how to compare variations of models for large-scale cus¬ 
tomer experiments. Portions of this material come from the book 
"Enterprise Data Workflows with Cascading." 

You will learn how to migrate predictive models to run on 
Hadoop clusters at scale, how to leverage PMML for customer ex¬ 
periments, and how the notion of "ensembles" has enhanced pre¬ 
dictive power: Netflix Prize, Kaggle, KDD, etc. 

Level: Overview 


11:00 am-12:00 pm 

Graph Database Use Cases

Max De Marzi 

Learn from existing open-source projects how to build proof- 
of-concept solutions and how to add a new tool to your develop¬ 
ment toolkit. Social Networks, Recommendation Engines, 
Personalization, Dating Sites, Job Boards, Permission Resolution 
and Access Control, Routing and Pathfinding, and Disambiguation
are just a few of the use cases that lend themselves well to
graph databases. 

Level: Overview 

HBase Schema Design Done Right

Michael Segel 

Schema design is one of the areas which has often been over¬ 
looked yet can play a critical role when determining overall appli¬ 
cation performance when working with HBase. This class is based 
on practical experience and lessons learned on the importance of 
breaking away from the traditional relational-model approach to 
schema design, and is appropriate to all levels of individuals who 
are looking to improve their knowledge of HBase and of designing 
effective schemas. 

We will cover the following: 

• Tradeoffs between a flattened schema vs. storing complex 
structures in cells 

• The use of column families 

• Different key designs, including hashing, salting and com¬ 
posite keys 

• Secondary Indexing 

It is assumed that you have a basic understanding of HBase and
Hadoop.
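
To make the key-design bullet above concrete, here is a minimal sketch of salted, hashed and composite row keys. It is plain Python that only builds byte strings and does not call any HBase API. The bucket count, the zero-byte separator and the reversed-timestamp trick are illustrative assumptions, not the instructor's material.

```python
import hashlib
import struct

NUM_BUCKETS = 16  # assumed number of salt buckets (must be <= 256 here)


def salted_key(key: bytes, buckets: int = NUM_BUCKETS) -> bytes:
    """Prefix the key with one bucket byte derived from its hash.

    Sequential keys are spread over `buckets` key ranges, at the cost of
    needing one scan per bucket to read a range back.
    """
    bucket = hashlib.md5(key).digest()[0] % buckets
    return bytes([bucket]) + key


def hashed_key(key: bytes) -> bytes:
    """Replace the key with a fixed-width digest (point lookups only)."""
    return hashlib.md5(key).digest()


def composite_key(user_id: str, timestamp_ms: int) -> bytes:
    """User id, a zero-byte separator, then a reversed timestamp.

    Reversing the timestamp makes the newest row for a user sort first
    under byte-wise (lexicographic) ordering.
    """
    reversed_ts = (2**63 - 1) - timestamp_ms
    return user_id.encode("utf-8") + b"\x00" + struct.pack(">q", reversed_ts)
```

Salting keeps the original key recoverable (drop the first byte), while hashing does not, which is one of the tradeoffs a key design has to weigh.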

Level: Intermediate 

Implementing a Simple Mongo Application 

Deep Mistry 

One simple app, three different technologies deployed locally 
to two different clouds. The application is executed, the source is 
examined, the approach is compared, the tools are demonstrated, 
and your questions are answered. 

Level: Overview 

In-Database Predictive Analytics

John A. De Goes 

Predictive analytics have long lived in the domain of statistical 
tools like R. Increasingly, however, as companies struggle to deal 
with exploding volumes of data not easily analyzed by small data 
tools, they are looking at ways of doing predictive analytics di¬ 
rectly inside the primary data store. 

This approach, called in-database predictive analytics, eliminates
the need to sample data and perform a separate ETL process
into a statistical tool, which can decrease total cost, improve the 
quality of predictive models, and dramatically shorten develop¬ 
ment time. In this class, you will learn the pros and cons of doing 
in-database predictive analytics, highlights of its limitations, and 
the tools and technologies necessary to head down the path. 

Level: Advanced 

Large-Scale, High-Accuracy Entity Extraction Made Easy 

Tim Furche 

Big Data is a great opportunity to make smarter decisions. But 
it is also a great challenge, in particular where Big Data comes as 
huge collections of raw text, logs, tweets, etc. Entity and relation 
extraction are crucial components in turning such collections of 
unstructured text into more meaningful, “smart” data. There ex¬ 
ists a plethora of commercial and open-source services or tools 
for extracting entities such as cities, company names, or prices 
from documents. Unfortunately, traditional services have suffered 
from a trifecta of challenges: low coverage, inconsistent accuracy, 
and complex, tool-specific APIs. 

In this class, we will introduce a recent open-source API, 
ROSEAnn, which provides a simple, uniform interface for most of 
the existing extraction services and tools out there. We will walk 
through several scenarios for using ROSEAnn, from detecting 
mentions of a company to more complex cases combining the 
detection of several entity types. In addition to providing a uni¬ 
form interface, ROSEAnn also allows you to easily “scale up” the 
accuracy and coverage of your entity extraction by a smart inte¬ 
gration of an arbitrary number of extraction services. On entity 
types where the underlying services overlap, accuracy is im¬ 
proved (by reconciling the different results); where they don’t 
overlap, coverage is increased. At the end of this class, you will be 
able to deploy automatic entity and relation extraction easily, and 
make use of the integration features of ROSEAnn to achieve entity 
extraction with unparalleled coverage and accuracy. 

Level: Advanced 

The Hadoop Ecosystem: Putting the Pieces Together 


Jonathan Seidman 

Everybody’s talking about Hadoop and Big Data, and a number 
of companies are undertaking efforts to explore how Hadoop can 
be applied to optimize their data-management and processing 
processes, as well as address challenges with ever-growing data 
volumes. Unfortunately, there’s still a lack of understanding of 
how Hadoop can be leveraged, not to mention how the tools in 
the Hadoop ecosystem can be used together to implement data- 
processing pipelines. 

This class will seek to provide clarity by first discussing some
typical real-world use cases for Hadoop that are allowing companies
to address challenges and derive tangible value. We’ll then
dive deeper to discuss specific tools in the Hadoop ecosystem 
such as Hive, Pig, Oozie, Flume, Sqoop and Mahout. More impor¬ 
tantly, we’ll discuss some example architectures to understand 
how these tools can be used together to create processing 
pipelines that implement some of these use cases. Since Hadoop 
isn’t a panacea, we’ll also discuss criteria for determining when 
Hadoop is a suitable fit and when it isn’t, as well as some sugges¬ 
tions for getting started with a Hadoop pilot project. 

Level: Intermediate 

1:45 pm - 2:45 pm 

Analyzing Tweets with HBase, Part I

Sameer Farooqui 

This two-hour class will cover how to use the Twitter API to 
download and model tweets in HBase, and then run natural-lan¬ 
guage processing against them. We will first cover the architecture 
fundamentals of HBase, including log-structured merge trees, 
data models, memstores, HFiles and Bloom filters. Next, tweets 
will be populated into HBase. Finally, we will explore some of the 
more interesting analysis that can be done with the tweets and 
NLP. All code for this class is publicly released under a Creative 
Commons license. 

Level: Intermediate 


“There are some great classes covering a wide range of areas, 
from technical to business-related, general principles and
specific technologies. It’s a good value for the cost.” 

— Nicolas Metts, Software Engineer, CableLabs 
Intro to Machine Learning: A Crash Course, Part I

Paco Nathan 

This two-part, 120-minute class provides a crash-course introduction
to Machine Learning. We’ll start by defining the terminology,
making comparisons with the related fields of
statistical inference and optimization theory, and then review
some history of ML, from early neural nets onward. We’ll consider
a process for feature engineering, with emphasis on using
tools for data prep and visualization, plus how to grapple with
dimensional reduction.

The remainder of the practice will be divided into three parts: 
Representation: a survey of useful algorithms, including proba¬ 
bilistic data structures, text analytics and NLP, plus issues to con¬ 
sider; Evaluation: distinguishing how some methods work better 
for given use cases, including issues of overfitting, bias, etc., and 
the use of quantitative measures; and Optimization: methods for 
improving on a good thing, including how to move from graph 
theory to sparse matrices, ensemble models, plus a look at ML 
competition platforms. We’ll conclude with suggestions for where
to continue further studies. 

Prerequisites: some familiarity with programming, probability, 
statistics, linear algebra, and calculus. We will be programming in 
R and Python, along with some bits of Hadoop and Spark. 

Note: This class is part lecture and part hands-on; you are re¬ 
quired to bring a laptop. 

Level: Intermediate 

Introduction to Apache Pig, Part I

Jeffrey Breen 

This two-part class provides an intensive introduction to Pig 
for data transformations. You will learn how to use Pig to manage 
data sets in Hadoop clusters, using an easy-to-learn scripting lan¬ 
guage. The specific topics of the 120-minute class will be cali¬ 
brated to your needs, but we will generally cover: 

• What is Pig and why would I use it? 

• Understanding the basic concepts of data structures in Pig 

• Understanding the basic language constructs in Pig. 

We’ll also create basic Pig scripts. 

Prerequisites: This class will be taught in a Linux environment, 
using the Hive command-line interface (CLI). Please come pre¬ 
pared with the following: 

• Linux shell experience; the ability to log into Linux servers and 
use basic Linux shell (bash) commands is required 

• Basic experience connecting to an Amazon EC2/EMR cluster via SSH 

• Windows users should have a knowledge of Cygwin and Putty 

• A basic knowledge of Vi would be helpful but not necessary 

Also, bring your laptop with the following software installed in 
advance: 

• Putty (Windows only): You will log into a remote cluster for this
class. Mac OS X and Linux environments include SSH (Secure
Shell) support. Windows users will need to install Putty.

• A text editor: An editor suitable for editing source code, such as 
SQL queries. On Windows, WordPad (but not Word) or 
Notepad++ (but not Notepad) are suitable. 

Level: Overview 

Running, Managing and Operating Hadoop at Sears

Justin Sheppard 

High ETL complexity and costs, data latency and redundancy, 
and batch window limits are just some of the IT challenges caused 
by traditional data warehouses. This presentation covers the use 
cases and technology that enable Sears to solve the problems of
the traditional enterprise data warehouse approach. Learn how 
Sears significantly minimized data architecture complexity with 
Hadoop, resulting in a reduction of time to insight by 30-70%, and 
discover “quick wins” such as mainframe MIPS reduction. 

What to expect to learn: 

• What are HDFS and MapReduce

• Traditional Database vs Hadoop 

• Logical and Physical View of Hadoop 

• Hadoop as an enterprise data hub

Level: Overview

Understanding MongoDB: New Features Explored 
Through Code

Jonathan Freeman 

This class is geared toward understanding some of the new 
features and enhancements that were released in MongoDB 2.4. 


“The hands-on tutorials were practical and useful. They had real 
examples, which is exactly what I came for and I was not 
disappointed. Also, the Women in Big Data Luncheon alone 
was almost worth the cost of admission.” 

— Naomi Anderson, Sr. Software Developer









Technical Classes 


15 • October 15-17, 2013 • San Francisco • www.BigDataTechCon.com


Wednesday, October 16 


We'll explore these concepts by building an application that uses 
text search, geospatial queries and the aggregation framework. 
The application will be built entirely in JavaScript utilizing 
Node.js and jQuery. However, the emphasis will be on MongoDB, 
so those who are not experts in JavaScript need not worry.

Level: Intermediate

Untangling the Relationship Hairball with 
a Graph Database

Max De Marzi 

Not only has data gotten bigger, it’s gotten more connected. 
Make sense of it all and discover what these Big Data connections 
can tell you about your users and your business. Come to this 
class to learn some of the different use cases for graph databases, 
and how to spot the non-obvious opportunities in your data. 

Level: Overview 

Using Hadoop to Lower the Cost of Data Warehousing:
The Paradigm Shift Underway

Dave Jespersen 

Data warehouses are bursting from increased data volume, 
and new sources of data are making traditional approaches to 
data analysis costly and slow. Typically, analysts define the prob¬ 
lem, identify data samples and pull the data through an ETL (ex¬ 
tract, transform and load) process. But now, Hadoop is changing 
the data-warehousing landscape by improving data archiving and 
lowering costs by offloading data warehouse processing. A 
Hadoop platform enables companies to easily scale as the volume,
velocity and variety of data continue to increase while providing
even higher-quality results.

This class will cover the operational cost of deploying Hadoop 
relative to more traditional data-warehousing implementations. 
We will cover real-world customer use cases and demonstrate 
how dramatic cost savings (often an order of magnitude) were
achieved through properly deployed Hadoop implementations.

Level: Overview

3:15 pm-4:15 pm 

Analyzing Tweets with HBase, Part II

Sameer Farooqui 

This two-hour class will cover how to use the Twitter API to 
download and model tweets in HBase, and then run natural-lan¬ 
guage processing against them. We will first cover the architecture 
fundamentals of HBase, including log-structured merge trees, data 
models, memstores, HFiles and Bloom filters. Next, tweets will be 
populated into HBase. Finally, we will explore some of the more interesting
analysis that can be done with the tweets and NLP. All code
for this class is publicly released under a Creative Commons license.

Level: Intermediate


Data Modeling and Relational Analysis in a NoSQL World 


Most Popular! 


Michael Miller 

The new wave of NoSQL technology is built to provide the flexibility
and scalability required by agile Web, mobile and enterprise
applications. Interestingly, any system that supports
chained Map/Reduce processing (specifically Map/Reduce Map)
fulfills the basic query requirements of a SQL engine. Therefore,
we will work to help you bridge the gap between SQL, relational
(big) data, and the brave new world of NoSQL.

In this class, you will learn how to model real-world relational 
data in a modern document database. We next go on to compile 
various SQL operations (SELECT, SUM, AVG, JOIN, etc.) into exceptionally
simple Map/Reduce programs. We finish with a study
demonstrating the performance, scalability and “time-to-value” 
benefits of this approach, specifically the pre-computation of ma¬ 
terialized views. The class will be a mix of chalkboard and interac¬ 
tive demonstrations. 
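
As a rough illustration of compiling a SQL aggregate into Map/Reduce, the sketch below expresses `SELECT region, SUM(amount), AVG(amount) ... GROUP BY region` as a map step and a reduce step in plain Python. The `orders` documents, the field names and the tiny in-memory `run_map_reduce` driver are all hypothetical; a real document database would supply its own Map/Reduce runtime.

```python
from collections import defaultdict


def map_order(doc):
    """Map step: emit (group key, (partial sum, partial count))."""
    yield doc["region"], (doc["amount"], 1)


def reduce_pairs(values):
    """Reduce step: add up partial (sum, count) pairs.

    The output has the same shape as the input values, so the reducer
    can be re-applied to its own results.
    """
    total = sum(v[0] for v in values)
    count = sum(v[1] for v in values)
    return (total, count)


def run_map_reduce(docs, mapper, reducer):
    """Toy in-memory driver: group mapper output by key, then reduce."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in mapper(doc):
            groups[key].append(value)
    return {key: reducer(values) for key, values in groups.items()}


orders = [
    {"region": "west", "amount": 10.0},
    {"region": "west", "amount": 30.0},
    {"region": "east", "amount": 5.0},
]
result = run_map_reduce(orders, map_order, reduce_pairs)
# SUM is the first element of each pair; AVG is sum divided by count.
summary = {k: {"sum": s, "avg": s / n} for k, (s, n) in result.items()}
```

Carrying the count alongside the sum is what lets AVG be computed from partial results; averaging the averages directly would give the wrong answer for groups of unequal size.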

Prerequisites: Bring a laptop with a modern browser (Chrome, 
Safari or Firefox). Previous experience with basic scripting lan¬ 
guages (e.g., JavaScript) is an advantage but not a requirement. 
All data and code samples will be provided at the beginning of 
the class. 

Level: Advanced 

Intro to Machine Learning: A Crash Course, Part II

Paco Nathan 

This two-part, 120-minute class provides a crash-course introduction
to Machine Learning. We’ll start by defining the terminology,
making comparisons with the related fields of
statistical inference and optimization theory, and then review
some history of ML, from early neural nets onward. We’ll consider
a process for feature engineering, with emphasis on using
tools for data prep and visualization, plus how to grapple with
dimensional reduction.


“Great for high-level learning!”

—Carol Long, Executive Acquisitions Editor, Wiley Publishing


The remainder of the practice will be divided into three parts: 
Representation: a survey of useful algorithms, including proba¬ 
bilistic data structures, text analytics and NLP, plus issues to con¬ 
sider; Evaluation: distinguishing how some methods work better 
for given use cases, including issues of overfitting, bias, etc., and 
the use of quantitative measures; and Optimization: methods for 
improving on a good thing, including how to move from graph 
theory to sparse matrices, ensemble models, plus a look at ML 
competition platforms. We’ll conclude with suggestions for where 
to continue further studies. 

Prerequisites: some familiarity with programming, probability, 
statistics, linear algebra, and calculus. We will be programming in 
R and Python, along with some bits of Hadoop and Spark. 

Note: This class is part lecture and part hands-on; you are re¬ 
quired to bring a laptop. 

Level: Intermediate 

Introduction to Apache Pig, Part II

Jeffrey Breen 

This two-part class provides an intensive introduction to Pig 
for data transformations. You will learn how to use Pig to manage 
data sets in Hadoop clusters, using an easy-to-learn scripting lan¬ 
guage. The specific topics of the 120-minute class will be cali¬ 
brated to your needs, but we will generally cover: 

• What is Pig and why would I use it? 

• Understanding the basic concepts of data structures in Pig 

• Understanding the basic language constructs in Pig. 

We’ll also create basic Pig scripts. 

Prerequisites: This class will be taught in a Linux environment, 
using the Hive command-line interface (CLI). Please come pre¬ 
pared with the following: 

• Linux shell experience; the ability to log into Linux servers and 
use basic Linux shell (bash) commands is required 

• Basic experience connecting to an Amazon EC2/EMR cluster via SSH 

• Windows users should have a knowledge of Cygwin and Putty 

• A basic knowledge of Vi would be helpful but not necessary

Also, bring your laptop with the following software installed in 
advance: 

• Putty (Windows only): You will log into a remote cluster for this 
class. Mac OS X and Linux environments include SSH (Secure 
Shell) support. Windows users will need to install Putty. 

• A text editor: An editor suitable for editing source code, such as 
SQL queries. On Windows, WordPad (but not Word) or 
Notepad++ (but not Notepad) are suitable. 

Level: Overview 


Running Mission-Critical Applications on Hadoop 

Dave Jespersen 

This class will look at what is involved when you move Hadoop 
from a lab environment to actual deployment in production. We 
will cover the critical enterprise-grade features like data integra¬ 
tion, data protection, business continuity and high availability, 
and discuss the ways you can accomplish these in your environ¬ 
ment. We will also identify potential stumbling blocks, identify 
what a platform can or can’t provide, and help determine the 
scope and level of customization necessary to make your deploy¬ 
ment successful. 

At the end of the class, you will better understand how to move 
Hadoop from the test bed to production deployment, what is in¬ 
volved in the process, and how to run a mission-critical Hadoop en¬ 
vironment. Where appropriate, there will be real-world examples. 

Level: Intermediate 

Staying Alive: Ensuring Service Availability in Hadoop

Vinithra Varadharajan

All sorts of things can go wrong in your data center. Ensuring 
that your Hadoop systems stay up through various types of 
threats—from node failures to site failures—is vital toward meet¬ 
ing SLAs and ensuring a high quality of experience for clients of 
your production-level distributed system. We will discuss the vari¬ 
ous threat models that need to be handled, and the elements of 
how to build highly available architectures for key system services. 

In this class, we will shed light on the complexity of what it 
takes to keep different system services alive, starting from core 
Hadoop services of HDFS and Map/Reduce, and extending to 
higher-level applications such as Hive. This understanding will
help you evaluate the risk profile and cost of providing different
SLAs for your entire Hadoop system. In this class you will learn
about:

• Worker node versus master node failure characteristics and tolerance
levels of various services in the Hadoop ecosystem

• Vital components of each service as they pertain to availability
(metadata, data, databases, ZooKeeper quorum, etc.)

• What it takes to set up backup nodes versus backup clusters

Level: Advanced


“The conference has good content and selection of speakers,
and is well organized in general.”

—Volker Schulz, VP of Engineering, Idea5

Storm: Real-Time Data Processing at a Massive Scale 


Jill Jacobucci 

Using the stream processing and Hadoop ecosystem, Sears is 
developing a platform to collect, process and publish data in real
time. By integrating Storm into the enterprise data flow and deploying
scalable topologies that source from multiple data
sources, this platform makes it possible to process hundreds of 
thousands of messages per second. This processing is essential 
for many Sears real-time initiatives, including gamification, indi¬ 
vidualized customer offers, inventory management, sales metrics 
and fraud detection. 

What to expect to learn: 

• Overview of real-time computation 

• Components and operational features of Storm 

• Proposed real-time processing platform

Level: Intermediate

8:45 am - 9:45 am 

Building an Impenetrable ZooKeeper

Kathleen Ting 

Apache ZooKeeper is a project that provides reliable and 
timely coordination of processes. Given the many cluster re¬ 
sources leveraged by distributed ZooKeeper, it’s frequently the 
first to notice issues affecting cluster health, which explains its 
moniker: “The canary in the Hadoop coal mine.” 

Come to this class and you will learn: 

• How to configure ZooKeeper reliably 

• How to monitor ZooKeeper closely 

• How to resolve ZooKeeper errors efficiently 

Culling from the diverse environments we’ve supported, we will
share what it takes to set up an impenetrable ZooKeeper environ¬ 
ment, what parts of your infrastructure specifically to monitor, 
and which ZooKeeper errors and alerts indicate something seriously
amiss with your hardware, network or HBase configuration.

Level: Intermediate


Data Modeling for Chat Messages with Cassandra

Ameet Chaubal 

Cassandra excels at many aspects of database design, such as 
fast ingestion, replication and distributed architecture. However, 
there are some other features of databases and modeling at which 
Cassandra shines and lends itself to interesting use cases. This 
class will focus on some of these features, specifically automatic 
ordering and automatic grouping of related data elements. 

In this hands-on class, we will explore using Cassandra for a 
chat-message storage/retrieval application. The class will set up a 
Cassandra instance in a VM, and demonstrate Schema design 
using Cassandra CLI and interaction with it using a Java client. 
You will gain an appreciation for Cassandra, its application design 
in Java, and its data-modeling intricacies. 

Prior experience in Java will be helpful; however, even without 
it, watching the demonstration will allow you to gain from the ex¬ 
ercises. 

Please note: You will need a laptop with VMware, CentOS and 
Eclipse installed. 

Level: Intermediate 

First Steps to Big Data from MySQL

Dave Stokes 

MySQL is the ubiquitous database on the Web, and just about 
every organization has a copy running someplace. But how do 
you get your information in your MySQL instances into a Big Data 
store? This class covers basic data warehousing that can be done 
with the community edition, column storage engines, and finally 
moving into Hadoop (more than 80% of all Hadoop sites feed data 
from MySQL). So if you need to plunge into deep data but need 
some guidance, please attend this class. 

Level: Overview 


“There is very little vendor pitch and something for everyone.”

— Mani Sivagnanam, Sr. Manager, Marketing Systems, Staples 
Hadoop by Example

Serge Blazhievsky 

This class is designed to demonstrate the most commonly 
used Map/Reduce design patterns for various problems. Perform¬ 
ance and scalability will be taken into consideration. 

The class will present a general overview of the problems that 
can be solved using Map/Reduce, scalability and performance 
tuning for clusters of different sizes. The techniques described 
here can be used on all Hadoop distributions. 

The following technical problems will be covered: 

• “Hello world!” of the Map/Reduce universe—a word count ex¬ 
ample 

• Map-only Map/Reduce jobs and their usage for ETL-type jobs

• Global sorting techniques 

• Sequence files and their usage in Map/Reduce jobs

• Map files and their usage in Map/Reduce jobs

• Reduce-side join and its advantages and limitations 

• Map-side join and its advantages and limitations 

Each technique will be provided with a code example that can 
be used as a template. No prior knowledge about the topic is re¬ 
quired; however, some Java knowledge is recommended. 
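
For readers who have not seen it, the word-count example mentioned in the list above can be sketched as follows. This is a plain-Python simulation of the map, shuffle and reduce phases, not a Hadoop job (the class's own templates are presumably in Java); the function names and the tokenizing regular expression are assumptions made for illustration.

```python
import re
from collections import defaultdict


def map_words(line):
    """Map phase: emit (word, 1) for every word in one input line."""
    for word in re.findall(r"[a-z']+", line.lower()):
        yield word, 1


def reduce_counts(word, counts):
    """Reduce phase: sum all the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Simulate map -> shuffle (group by key) -> reduce in memory."""
    shuffled = defaultdict(list)
    for line in lines:
        for word, one in map_words(line):
            shuffled[word].append(one)
    return dict(reduce_counts(w, c) for w, c in shuffled.items())
```

On a cluster, the grouping done here by the `shuffled` dictionary is performed by the framework between the map and reduce phases, so only the two small functions need to be written.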

Level: Intermediate 

Managing a World of Data: Geospatial Best Practices 

Norman Barker 

Everything happens somewhere, and organizations are realiz¬ 
ing how important a role location, surroundings and time play in 
decision-making, including: 

• Which driving route is least congested? 

• Where’s the closest top-rated French restaurant? 

• Are field equipment malfunctions related to altitude or time of 
day? 

• What people will be in a particular location given their current 
activity? 

The challenge is how to store, index and query the massive 
quantities of spatial and temporal data generated via mobile 
phones, cameras, computers, sensors and Internet-enabled ap¬ 
pliances. This class will present the best practices emerging from 
the field of geospatial data and the specialized database systems 
developed to manage it. We will cover: 

• Apps that inspire: novel ways geospatial data is being applied in 
government and industry 

• Geospatial data management standards such as GeoJSON and 
GML; which should you follow? 

• Geospatial basics: bounding box and proximity searches 

• Advanced geospatial operations, including storing complex 
geometries, temporal and geo-metadata; performing bounding 
polygon and radius, intersections, and buffering; and best prac¬ 
tices for scaling and partitioning geospatial data 


• Comparison of specific SQL and NoSQL geo-indexing libraries 

Attend this class to learn how to best store, index and query the 
spatial and temporal data generated by mobile devices, sensor 
networks, and Internet-enabled appliances. Fundamentals on 
geospatial data-management standards (GeoJSON, GML), 
bounding box and proximity searches, and the storage of geome¬ 
tries and geo-metadata will also be covered. Databases will in¬ 
clude PostGIS and CouchDB (running on Cloudant). 
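
The two geospatial basics named above, bounding box and proximity searches, can be sketched in a few lines of plain Python. This is a brute-force illustration without any index and does not use PostGIS or CouchDB; the point format (dictionaries with `lat` and `lon` keys), the function names and the mean Earth radius of 6371 km are assumptions. The bounding-box test does not handle boxes that cross the antimeridian.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def in_bounding_box(lat, lon, south, west, north, east):
    """True if the point lies inside the box (no antimeridian wrap)."""
    return south <= lat <= north and west <= lon <= east


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(points, lat, lon, radius_km):
    """Proximity search: keep the points within radius_km of (lat, lon)."""
    return [p for p in points
            if haversine_km(lat, lon, p["lat"], p["lon"]) <= radius_km]
```

A common design is to run the cheap bounding-box filter first and apply the exact distance test only to the survivors; a geo-index plays the role of that first filter at scale.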

Level: Overview 

Proper Care and Feeding of HBase Coprocessors

Michael Segel 

Coprocessors are a relatively new feature within HBase. While 
they are capable of providing useful and powerful performance 
improvements, if they are not designed properly, they can have 
extremely detrimental effects. Like the Tribbles of “Star Trek” or the
Gremlins, one must use extreme caution and follow good prac¬ 
tices, or else bad things can happen... 

This class provides an introduction to Coprocessors, focusing 
on the potential problems that can arise from poor design as well 
as issues with the current implementation. This class is geared to¬ 
ward the more experienced HBase users and will provide some 
common examples of how Coprocessors are being used today, as 
well as some potential future use cases. 

Level: Intermediate 

11:30 am-12:30 pm 

A/B Testing in a Big Data Environment

Gabor Melli 

Big Data has allowed organizations to make significantly better 
evidence-based decisions than ever before. However, with so much 
information, a new challenge for us is to rank all the possible ac¬ 
tionable hypotheses. There is good news on this front because 
many organizations have shown that it is possible to introduce 
controlled (A/B) experiments into business intelligence processes. 

But enhancing your organization’s business intelligence 
process maturity level to include A/B-based testing, particularly 
in a Big Data environment, requires new underlying technical 
and business capabilities - and cultural shifts. This class reviews 
the foundations of A/B testing and presents several case studies 
of successful and failed applications of A/B testing in a Big Data 
setting. This will help you apply sound A/B testing in your data-driven
organization. Attend this class to learn:

• How other organizations are utilizing A/B testing 

• Best practices for applying A/B testing

• Where to place your organization in an A/B testing inclusive 
maturity model 

• Methods to introduce A/B testing into your environment 

Level: Overview 
Building Your Own Facebook Graph Search with Cypher
and Neo4j

Max De Marzi 

Learn how to create your own Facebook Graph Search or how 
to build a similar system with your own company data. Also learn 
how to interpret natural language into a grammar and use it to 
build Cypher queries to retrieve data from a graph. Knowledge of 
Natural Language Processing is not required. The instructor works
for Neo Technology, creators of the Neo4j Graph Database. 

Level: Intermediate 

Introduction to Parallel Iterative Machine-Learning 
Algorithms on Hadoop’s Next-Generation 
YARN Framework

Josh Patterson 

Online learning techniques, such as Stochastic Gradient Descent
(SGD), are powerful when applied to risk minimization and
convex games on large problems. However, their sequential design
prevents them from taking advantage of newer distributed
frameworks such as Hadoop Map/Reduce. In this class, we will
take a look at how we parallelize parameter estimation for linear
models on the next-gen YARN framework IterativeReduce and
the parallel machine-learning library Metronome.

Level: Advanced 
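
One common way to parallelize SGD for a linear model is iterative parameter averaging: each worker runs a sequential SGD pass over its own data partition, a master averages the resulting parameters, and the averaged parameters seed the next round. The sketch below simulates that loop in a single process. It is an illustration under that assumption only; it does not use or reproduce the IterativeReduce or Metronome APIs, and all names in it are hypothetical.

```python
def local_pass(partition, w, b, lr):
    """One sequential SGD pass over a worker's partition for the model y ~ w*x + b."""
    for x, y in partition:
        err = (w * x + b) - y   # gradient of 0.5 * err**2 with respect to the prediction
        w -= lr * err * x
        b -= lr * err
    return w, b


def iterative_average(partitions, rounds=500, lr=0.05):
    """Simulate master/worker rounds: workers train locally, the master averages."""
    w, b = 0.0, 0.0
    for _ in range(rounds):
        # On a cluster, each call below would run on a separate worker.
        results = [local_pass(p, w, b, lr) for p in partitions]
        w = sum(r[0] for r in results) / len(results)
        b = sum(r[1] for r in results) / len(results)
    return w, b


if __name__ == "__main__":
    # Synthetic, noise-free data from y = 2x + 1, split across four "workers".
    data = [(i / 100, 2 * (i / 100) + 1) for i in range(100)]
    partitions = [data[k::4] for k in range(4)]
    print(iterative_average(partitions))
```

Averaging after every pass keeps the workers from drifting apart, at the cost of one synchronization step per round.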

Making it Real: Leveraging Big Data to Solve Big Problems 

Siva Vaidyanatha 

There is a lot of hype around Big Data and its myriad possibilities.
This class makes it real and talks about concrete ways in
which Big Data is used today. This is an advanced class that dives
into architecture and design details. For illustration purposes, it
uses case studies from the retail and life science industries. This
class covers:

• The current state of technology: predictive analytics and Big Data 

• An overview of a common Big Data stack: the Hadoop ecosystem 

• “Beneath the covers” explanation of how Map/Reduce enhances 
predictive analytics 

• Data collection and munging 

• Model creation 

• Visualization and interpretation 

• Scaling for very large data sets 

• Case studies: deep-dive (includes architecture, design and sample
code)

• Understanding the consumer “genome”: leveraging predictive
analytics and machine learning for extreme omni-channel personalization
by understanding consumer behavior

• Drug repurposing in life sciences: computational methods used 
in the screening phases of drug discovery and drug design. 

Level: Advanced 


Most Popular! 


Real-Time Hadoop

Michael Segel 

We will start with a short introduction to different approaches
to using Hadoop in the real-time environment, including real-time
queries, streaming, and real-time data processing and delivery.
Then we’ll describe the most common use cases for real-time
queries and products implementing these capabilities. We will
also describe the role of streaming, common use cases, and products
in the space.

The majority of time will be dedicated to the usage of HBase as
a foundation for real-time data processing. We describe several
architectures for such an implementation, and a high-level design
and implementation for two examples: a system for storing and
retrieving images, and using HBase as a back end for Lucene.

Level: Intermediate 


Seven Deadly Hadoop Misconfigurations

Kathleen Ting 

Misconfigurations and bugs break the most Hadoop clusters. 
Fixing misconfigurations is up to you! 

Attend this class to learn how to get your Hadoop configuration
right the first time. In some support contexts, a handful of common
issues account for a large fraction of cases. That is not the case for
Hadoop, where even the most common specific issues account for
no more than 2% of support cases. Hadoop errors show up far from
where you configured the setting, making it hard to know which log files to
analyze. It pays to be proactive. Come to this class!

Level: Intermediate 


“Big Data TechCon is great for beginners as well as 
advanced Big Data practitioners. It’s a great conference!” 

— Ryan Wood, Software Systems Analyst, Government of Canada 
Time + Space + Transaction: Multidimensional Geospatial
Analysis with Hadoop

Dan Rosanova 

Most organizations do a good job of reporting on the past, 
some do a good job of estimating the future, and few have a total 
understanding of their environment, customers, or business 
based on multidimensional analytics that account for time, space 
and transactions. This class will tie these three areas of analytics
together to teach attendees how to use commonly available information
and tools to achieve deep insights.

The arrival of Big Data blurs the boundaries between reporting,
software development and analytics. This deeply technical
class will walk through moving beyond the most common sources
of transactional reporting and temporal analytics to include
geospatial dimensions that can unlock the true potential of our
data. By examining CRM (transactional), Web logs (temporal) and
IP Geolocation (spatial), this deep-dive will walk developers and
data scientists through a real example of identifying trends across
transactions, time and space to discover deeper insights and actionable
intelligence.

From data procurement, transformation, loading and analysis,
to display and interpretation, this class will teach attendees how
to dig deeper into their data by reaching across paradigms and
using machine-learning algorithms and rich presentation platforms
to explore and visualize Big Data.

Level: Advanced 

1:45 pm - 2:45 pm 

Building Applications That Predict User Behavior Through 
Big Data Using Open-Source Technologies

Simon Chan 

One of the biggest challenges for data engineers building real-world
predictive applications with Big Data is the steep learning
curve of multiple data-processing frameworks, learning algorithms
and scalable programming.

In this class, data engineers will get hands-on instruction in
adding predictive features, such as personalization, recommendation
and content discovery, to applications
using Big Data. The class will begin with a brief overview of
scalable machine learning for Big Data. You will then see
demonstrations with the use of open-source technologies such
as Hadoop, Cascading, Scalding and PredictionIO with live
sample code. A number of collaborative filtering algorithms
will be explained.

You will also see the use of open-source user-friendly control
interfaces to evaluate, compare, select and deploy learning algorithms;
tune hyperparameters of algorithms manually or automatically;
and review the predictive model training status. By the
end of the class, you will master the core concepts of Machine
Learning and be able to apply scalable algorithms in a real software
production environment.

Level: Advanced 

Getting Started with R and Hadoop, Part I

Jeffrey Breen 

Increasingly viewed as the lingua franca of statistics, R is a natural
choice for many data scientists seeking to perform Big Data
analytics. And with Hadoop Streaming, the formerly Java-only Big
Data system is now open to nearly any programming or scripting
language. This two-part class will teach you options for working
with Hadoop and R before focusing on the RMR package from the
RHadoop project. We will cover the basics of downloading and installing
RMR, and we will test our installation and demonstrate its
use by walking through three examples in depth.

You will learn the basics of applying the Map/Reduce paradigm
to your analysis, and how to write mappers, reducers and
combiners using R. We will submit jobs to the Hadoop cluster and
retrieve results from the HDFS. We will explore the interaction of
the Hadoop infrastructure with your code by tracing the input
and output data for each step. Examples will include the canonical
“word count” example, as well as the analysis of structured
data from the airline industry.

No specific prerequisite knowledge is required, but a familiarity
with R and Hadoop or Map/Reduce is helpful.

Level: Advanced 
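
For readers new to the paradigm, the canonical "word count" can be sketched in a few lines. The class itself uses R and the RMR package; the version below is written in Python purely as an illustration, runs in a single process, and stands in for what Hadoop would distribute across a cluster. The function names are hypothetical.

```python
from collections import defaultdict


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(word, counts):
    """Reduce step: collapse the list of counts for one word into a total."""
    return word, sum(counts)


def word_count(lines):
    pairs = (pair for line in lines for pair in mapper(line))
    return dict(reducer(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data big", "Data tools"]))
```

Tracing the pairs emitted by `mapper` and the groups handed to `reducer` mirrors the step-by-step input and output tracing the class description mentions.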

Hadoop Backup and Disaster Recovery 101

Jairam Ranganathan 

Any production-level implementation of Hadoop must have its 
data protected from threats. Threats to data integrity can be 
human-generated (malicious/unintentional) or site-level (power 
outage, flood, etc.). As soon as you start to identify these threats,
it’s important to develop a backup or disaster-recovery solution
for Hadoop!

In this class, you will learn the unique considerations for
Hadoop backup and disaster recovery, as well as how to navigate
the common issues that arise when architects and developers
look to protect the data.

We'll cover:

• How to model your backup/disaster-recovery solution, considering
your threat model and specifics around data integrity,
business continuity, and load balancing

• Best practices and recommendations, highlighting Hadoop in
contrast to traditional SAN/DB systems; replication versus
“teeing” models for ensuring DR; replication scheduling; Hive;
HBase; managing bandwidth; monitoring replication; using
one's secondary beyond replication; and a survey of existing
tools and products that can be used for backup and DR

After taking this class, you should be able to explain to your
organization the right way to effect a backup or data recovery
solution for Hadoop.

Level: Intermediate

“A great learning experience.”

—Schalk van der Merwe, CEO, RCS Group

Simple Yet Efficient Web Extraction with OXPath, Part I 

Tim Furche 

Big Data has already changed how we make decisions, whether 
on pricing, recommendations or investment. However, access to 
such Big Data is often expensive or limited to large organizations 
that collect it. Though much is available on the Web, it is often 
only available through Web Forms and HTML pages. 

In this two-part class, we give a thorough overview of large-scale
data extraction from the Web, as well as its challenges. The
first part gives an overview of existing tools and walks through
real-life examples of manual wrappers. In the second part, we
delve deeper into data extraction, and discuss common patterns
and fallacies when creating, maintaining, and running large-scale
data-extraction systems.

In the first part, we start with an overview of traditional approaches,
outlining their strengths and limitations to enable attendees
to more easily decide what tools are most appropriate for
their needs. We walk through real-life examples for manual wrapper
creation with XPath and WebDriver, the emerging W3C standard
for programmatic browser control.

Finally, we show that extracting Big Data from the Web doesn't
have to be hard or costly. We will show you how to extract data
with just a little knowledge of XPath. That's all you need to get
started with OXPath, a high-level, high-performance extension of
XPath for efficient data extraction from any website. OXPath extends
XPath with four well-defined extensions, including the ability
to simulate user actions and to select elements of a Web page
through their appearance. This allows for easy navigation through
complex Web applications, and reduces maintenance in the face
of structural page changes.

Level: Intermediate

“This conference is very, very well managed.”

— Rahul Joglekar, Architect


Windows Azure HDInsight and PowerPivot: Cloud-Based 
Data Analysis with Familiar and Friendly Tools

Dan Rosanova 

This class will explore the features of Windows Azure HDInsight
and Excel PowerPivot, a powerful combination of a cloud-based
Big Data platform and an Excel front end that allows data
scientists and analysts to explore semi-structured data in the familiar
Excel tools that many are already experienced with. From
provisioning and loading data, to structuring for Hive queries and
connecting with Excel's ODBC driver for Hive, this session will
be a hands-on walkthrough of leading Hadoop tools on the Windows
platform. Also included will be data exploration using Excel
tools, and particularly PowerPivot, to visualize and explore data in
a graphical Excel environment.

The cloud-based experience of Windows Azure HDInsight allows
for a pay-as-you-go model for processing data on 100%
Apache Hadoop-compatible clusters with zero configuration
time. This class will be hands-on and require Excel 2013 and access
to an HDInsight (or other Apache Hadoop-compatible) installation
(including HDInsight for Windows). Detailed software
requirements will be sent with the presentation ahead of time,
and you are expected to be familiar with SQL, Hadoop and Excel. Active
participation is strongly encouraged but not required, as is
working in pairs.

Level: Advanced 
Thursday, October 17


3:30 pm - 4:30 pm 

Getting Started with R and Hadoop, Part II

Jeffrey Breen 

Increasingly viewed as the lingua franca of statistics, R is a natural
choice for many data scientists seeking to perform Big Data
analytics. And with Hadoop Streaming, the formerly Java-only Big
Data system is now open to nearly any programming or scripting
language. This two-part class will teach you options for working
with Hadoop and R before focusing on the RMR package from the
RHadoop project. We will cover the basics of downloading and installing
RMR, and we will test our installation and demonstrate its
use by walking through three examples in depth.

You will learn the basics of applying the Map/Reduce paradigm
to your analysis, and how to write mappers, reducers and
combiners using R. We will submit jobs to the Hadoop cluster and
retrieve results from the HDFS. We will explore the interaction of
the Hadoop infrastructure with your code by tracing the input
and output data for each step. Examples will include the canonical
“word count” example, as well as the analysis of structured
data from the airline industry.

No specific prerequisite knowledge is required, but a familiarity
with R and Hadoop or Map/Reduce is helpful.

Level: Advanced 


Most Popular!

Selecting the Right Big Data Tool for the Right Job, and
Making It Work for You

Eddie Satterly

This class will focus on the various types of Big Data solutions,
from open-source to commercial, and the specific
selection criteria and profiles of each. As in all technology
areas, each solution has its own sweet spots and challenges, whether
in CAP theorem, ACID compliance, performance or scalability.
This class will provide an overview of the technical tradeoffs for
the list of solutions in technical terminology. Once the technical
tradeoffs are reviewed, we will review the cost and value of open-source
solutions versus commercial software, and the tradeoffs
that folks must make to choose one over the other.

The next phase will go into great detail on use cases for specific
solutions based on real-world experience. All of the specific use
cases have been seen first-hand from our customers. The solutions
will be reviewed to the level of specific technical architecture and
deployment details. This is intended for a highly technical audience
and will not provide any high-level material on the solutions discussed.
It is assumed you have working knowledge of solutions such
as SQL, NoSQL, distributed file systems and time-series indexes.
You should also have an understanding of CAP theorem, ACID compliance
and general performance characteristics of systems.

Level: Advanced 


Hadoop Design Patterns

Serge Blazhievsky 

This class is designed to demonstrate how to solve the most
common problems with Map/Reduce technologies, and to optimize
your Map/Reduce jobs to run efficiently on a given Hadoop
cluster. The differences among types of joins will be described
with real code examples. After this class, you will understand:

• Different types of joins in Map/Reduce 

• ETL design patterns 

• Sort and secondary sort 

• Data-driven design patterns

Level: Intermediate
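
One of the join types commonly discussed for Map/Reduce is the reduce-side join: mappers tag each record with its source and emit it under the join key, and the reducer pairs up records from both sources that share a key. The sketch below simulates this in a single process as a generic illustration; it is not taken from the class materials, and the function names and sample records are hypothetical.

```python
from collections import defaultdict


def map_tagged(records, tag, key_index):
    """Map step: emit (join_key, (source_tag, record)) for each record."""
    for rec in records:
        yield rec[key_index], (tag, rec)


def reduce_join(key, tagged):
    """Reduce step: pair every left record with every right record for one key."""
    left = [rec for tag, rec in tagged if tag == "L"]
    right = [rec for tag, rec in tagged if tag == "R"]
    for l_rec in left:
        for r_rec in right:
            yield key, l_rec, r_rec


def reduce_side_join(left, right, left_key=0, right_key=0):
    # The grouping below stands in for the framework's shuffle-and-sort phase.
    groups = defaultdict(list)
    for key, value in map_tagged(left, "L", left_key):
        groups[key].append(value)
    for key, value in map_tagged(right, "R", right_key):
        groups[key].append(value)
    joined = []
    for key in sorted(groups):
        joined.extend(reduce_join(key, groups[key]))
    return joined


if __name__ == "__main__":
    users = [(1, "ann"), (2, "bob")]
    orders = [(1, "book"), (1, "pen"), (3, "cup")]
    for row in reduce_side_join(users, orders):
        print(row)
```

This behaves as an inner join: keys present in only one input (user 2 and order 3 in the sample) produce no output.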

Large-Scale Distributed Queries with Apache Drill

Jacques Nadeau 

This class will discuss the architecture behind a full ANSI SQL
large-query solution for Big Data. Apache Drill, like Hadoop, was
inspired by a Google white paper; is architected to work with
nested data structures such as JSON; and can process queries
against a variety of databases, including MongoDB, HBase and
Oracle. The class will give an overview of the use cases and present
the design of some of the critical architectural components.

Level: Advanced


Simple yet Efficient Web Extraction with OXPath, Part II 

Tim Furche 

In the second part of this class, we will look at more complex
wrappers, as well as the maintenance and management of large-scale
extraction infrastructure. We'll walk you through several examples
on how to create wrappers, driven by real use cases from
finance and competitive pricing. For these examples, we'll use the
OXPath Firefox IDE, which allows for the development of OXPath
wrappers using familiar Firefox developer tools. We will discuss
how to make wrappers robust and maintainable through a small
set of wrapper design patterns. OXPath’s open-source engine is
able to deal with many of the issues that make Web scraping a
pain, from buffer management to auto-complete fields.

However, we will also show the limits of the engine and how to
deal with them. We will conclude the presentation with best practices
for deploying and scheduling the resulting wrappers, e.g., for
repeated extraction to keep extracted data up to date.

Level: Advanced 






Big Data TechCon • October 15-17, 2013 • San Francisco • www.BigDataTechCon.com


Faculty 


Norman Barker 

Norman is a specialist in developing geospatial data discovery
and dissemination products. He has spent his career
managing and developing geospatial products. He is
also an open-source developer with contributions to
MapServer, GDAL and PostGIS. Norman holds a Master’s in mathematics
from the University of Durham, England. He is an avid rugby player.



Ameet Chaubal 

Ameet works in the Emerging Technology Innovation
group at Accenture, and has been architecting, developing
and implementing solutions leveraging distributed systems
for Fortune 500 clients. He has kick-started the Big
Data training academy at Accenture, and recently spoke at NYC Cassandra
Tech Day.


Serge Blazhievsky 

Serge is Principal Software Engineer at Nice Systems, and is
an experienced developer and architect with a rich background
in C++/Java and distributed systems. Nice Systems
uses Hadoop infrastructure for various data-processing
needs. His previous company used Hadoop infrastructure for all reporting
needs. Before that, Serge designed Hadoop infrastructure used for
Internet crawling and Web-page analysis. Serge holds a Master’s Degree
in Computer Engineering from Santa Clara University. Serge is a regular
contributor to various Hadoop conferences, including the Hadoop User
Group at Yahoo, the creator of Hadoop.

Dipti Borkar 

Dipti is Director of Product Management at Couchbase,
where she is responsible for the company’s flagship product,
Couchbase Server, and works with customers and
users to understand emerging requirements for low-latency,
scalable data stores. Dipti has deep technical experience in the
database industry, having worked at IBM as a software engineer and
Development Manager for the DB2 server team, and then at MarkLogic
as a Senior Product Manager.

Jeffrey Breen 

Jeffrey is the Principal of the Think Big Academy at Think 
Big Analytics. Jeffrey has been very active in local user 
groups, has taught and mentored throughout his career, 
and has presented talks recently on R and Hadoop to the 
Data Warehouse Institute, the Chicago Area Hadoop and R User groups, 
and the Boston Predictive Analytics Meetup. Jeffrey has also developed 
and delivered the RHadoop training course, as well as all materials for 
Revolution Analytics. 

Simon Chan 

Simon is a cofounder and product lead of PredictionIO, an
open-source Machine Learning Server that empowers programmers
and data engineers to build smart applications.
PredictionIO itself is built on top of solid open-source
technology, such as Scala, Hadoop, Mahout, Cascading and Scalding.
Starting off as a software engineer after graduating from university,
Simon founded three tech startups in the past 10 years, in the Bay Area,
in Hong Kong and in Mainland China. He specializes in machine learning
and recommendation technology, with a strong interest in social
applications. Simon is a Ph.D. candidate in Machine Learning at University
College London, and is a frequent speaker in the Data Science
community.






Ben Coverston 

Ben currently helps coordinate the training and support
activities at DataStax. He has more than 15 years of development
experience, and has written code running on some
of the largest travel websites in the world. He became interested
in Big Data through his experiences in troubleshooting data-related
problems in which the velocity and volume of data exceeded the
capabilities of a single machine.

Doug Cutting

Doug is the creator of numerous successful open-source
projects, including Lucene, Nutch and Hadoop. Doug joined
Cloudera in 2009 from Yahoo, where he was a key member
of the team that built and deployed a production Hadoop
storage and analysis cluster for mission-critical business analytics. Doug
holds a Bachelor’s degree from Stanford University and sits on the Board
(and is currently chairman) of the Apache Software Foundation.




John A. De Goes 

John is CEO and CTO of Precog, and is responsible for leading
the design and development of the company’s data-warehousing
and analysis platform. He has been working
professionally in distributed systems design and development
for more than a decade.

Author of multiple best-selling technical books, and a major contributor
to open source, John has an extensive background in scientific
and distributed computing, and in large-scale analytics. John is a frequent
and well-received speaker at industry events. Recent engagements
include DataWeek Conference, Glue Conference, Frontier
Developers, and NEScala.


Max De Marzi 

Max is a Software Field Engineer at Neo Technology, where 
he built the Neography Ruby Gem, a REST API wrapper to 
the Neo4j Graph Database. He is addicted to learning new 
things, taking on a challenge and finding (and sharing) 
pragmatic solutions. 




Sameer Farooqui 

Sameer is a freelance Big Data consultant and trainer, specializing
in Hadoop and Cassandra. For the past five years,
he has deployed various clustering software packages internationally
to clients, including Fortune 500 companies,
governments, hospitals and banks. Most recently, he was a Systems Architect
at Hortonworks, where he specialized in designing Hadoop prototypes
and Proof-of-Concept use cases. Previously, Sameer worked at
Accenture's Silicon Valley R&D lab, where he was responsible for studying
NoSQL databases, Cloud Computing and Map/Reduce for their
commercial applicability to emerging Big Data problems. At Accenture
Tech Labs, Sameer was the lead engineer for creating a 32-node prototype
using Cassandra and Amazon Cloud Computing to host 10TB of
Smart Grid data. He also worked on a more than 30-person team in the
design phase of a multi-environment Hadoop cluster pilot project at
NetApp. Before Hortonworks and Accenture, Sameer spent five years at
Symantec, where he deployed VERITAS Clustering and Storage Foundation
solutions (VCS, VVR, SF-HA) to Fortune 500 and government
clients throughout North America.

John Foreman 

John is the Chief Data Scientist for MailChimp.com. He
holds a graduate degree in Operations Research from MIT
and has worked as an analytics consultant for the Department
of Defense, Coca-Cola, Royal Caribbean International,
and Intercontinental Hotels Group. His expertise is in
optimization modeling, revenue management and predictive modeling.

For fun, he teaches analytics concepts through narrative fiction at 
Analytics Made Skeezy. 

Jonathan Freeman 

Jonathan is a Developer and Tech Evangelist for Open Soft¬ 
ware Integrators. He's a JavaScript specialist, Big Data and 
NoSQL enthusiast, writer, speaker and jazz musician. You 
can find his articles and blog posts on both the Open Software
Integrators website and as a guest writer on InfoWorld's Strategic
Developer blog.






Tim Furche 

Tim heads the DIADEM lab at Oxford University. He is also
a fellow at the Oxford-Man Institute for Quantitative Finance,
where he investigates the applications of Big Data
extracted from the Web for predicting financial indicators,
funded by the Man group. His research interests include data extraction,
XML and semi-structured data, in particular query evaluation and
optimization, and advanced Web information systems. He has authored
more than 50 peer-reviewed scientific publications, some of
them cited more than 200 times. His main contributions are on XPath
optimization and evaluation, on linear time and space querying of
large graphs, and on languages for Web data extraction, querying, and
search. From 2004 to 2008, he co-coordinated the working group on
“Reasoning-aware Querying” in the EU Network of Excellence REWERSE
at the Ludwig Maximilian University of Munich.

Tim has extensive experience as a lecturer. He has given several lectures
and hands-on courses at the University of Munich and at international
summer schools. He has given dozens of research talks at
international conferences, including the keynote at the International
Web Engineering Conference 2011. He has also held several tutorials
both at academic and developer conferences.


Mark Grover 

Mark is a Software Engineer at Cloudera and a contributor
to the Apache Hive open-source project. He is also a section
author of O'Reilly's book on Apache Hive called “Programming
Hive.” Mark is an active respondent on the Hive
mailing list and IRC channel.




Justin Hancock 

Justin is a tech-industry veteran with more than 15 years'
experience across a number of industries. He has previously
worked as an independent consultant, helping architect,
design and deploy Deutsche Telekom's first Hadoop
implementation. Prior to joining Cloudera, Justin worked as a Hadoop
Architect/Developer for DataSift. DataSift had Europe's largest HBase
cluster with 1PB of storage. Justin now works in Cloudera's Customer
Operations team as an Operations Engineer supporting Cloudera's customers
across the planet.

Justin has spoken at CRM Conferences in Asia, delivered customer 
training in that region, and provided mentoring to new and junior staff. 
Justin is from the U.K. and is married with one daughter. He is a very 
keen cyclist and can be found going very fast downhill on plastic bikes 
either on or off road. 


Jill Jacobucci 

Jill, Senior Manager of Hadoop Solution Architecture at
Sears Holding, has 17 years of experience in Information
Technology, working mainly in the middleware and distributed
infrastructure for Web applications area before moving
onto the big data platform. She enjoys the challenge of using a
highly distributed platform to solve real business problems. As manager
of the Hadoop Solution Architecture team, Jill works with a
number of Hadoop projects, developing solutions using the Hadoop
ecosystem tools.

Dave Jespersen 

Dave brings his deep engineering experience to his role of
chief customer advocate at MapR Technologies. He enriches
the customer experience by working with MapR's
customer base to develop and implement innovative solutions
to the complex problems faced by every enterprise.

He was previously VP of Engineering at MapR, where he led the development
of MapR's industry-leading products. Dave has 30 years of successful
enterprise software development experience in both small and
large companies, including EMC, Sun Microsystems, Sterling Software,
Spectra Logic, Exabyte and DEC. Dave was educated at Brigham Young
University, where he earned a BS M.E. and a minor in Computer Science.





Jock Mackinlay 

Jock is Tableau Software's Senior Director of Visual Analysis.
At Stanford University, he pioneered the automatic design
of graphical presentations of relational information.
He joined Xerox PARC in 1986, where he collaborated with
the User Interface Research Group to develop many novel applications
of computer graphics for information access, coining the term “Information
Visualization.” Much of the fruits of this research can be seen in
his book, “Readings in Information Visualization: Using Vision to
Think.” Jock has a Ph.D. in computer science from Stanford University.


Gabor Melli 

Dr. Gabor is the Chief Scientist at VigLink, focusing on automated
semantic mapping of large text corpora to large
knowledge bases, along with A/B testing frameworks. Previously,
Gabor founded PredictionWorks, a data mining
and analytics company that specializes in real-time predictive modeling
solutions. He has worked in retailing, telecommunications, banking
and the software industry to help companies such as WalMart, Microsoft,
AT&T, T-Mobile, and Washington Mutual, and applied his expertise
in data analysis and machine learning to significantly improve
their business processes.

Gabor's research interests include machine learning and semantic
analysis with special focus on semi-supervised and active learning. He
has published extensively in these areas and is actively creating,
semi-automatically, a knowledge base on artificial intelligence at
www.gabormelli.com/RKB. He is also the recipient of ACM's SIGKDD
2013 Service Award.


Michael Miller 

Mike is Chief Scientist at Cloudant, where he develops and
evangelizes the company's technical vision and manages
long-term product R&D. While at MIT as a Postdoctoral
Fellow, he cofounded Cloudant after cutting his teeth on
petabyte-per-second problems at the Large Hadron Collider. Mike
holds a B.S. in Physics and a B.A. in Philosophy from Michigan State
University, a Ph.D. in Physics from Yale University, and is an Affiliate
Professor of Particle Physics at the University of Washington. He has
more than a decade's worth of experience as a builder of the most extreme
Big Data systems on earth, as well as extensive experience lecturing
on mathematics, physics, data science, and philosophy at the
graduate and undergraduate level.

Deep Mistry 

Deep is a consultant at Open Software Integrators, a U.S.
firm specializing in NoSQL/Big Data development with offices
in Chicago and Durham, N.C. Deep has been programming
for more than eight years and has worked on
multiple software engineering projects, from developing Big Data training
materials to implementing large data systems requiring Internet-speed
response times. He has been heavily involved with MongoDB,
Neo4j, Couchbase and Hadoop since they arrived on the Big Data
scene. You can find his white papers and Big Data blogs on the Open
Software Integrators website. Deep received his Master's in Computer
Science from North Carolina State University.




Jacques Nadeau 

Jacques leads Apache Drill development efforts at MapR
Technologies. He is an industry veteran with more than 15
years of Big Data and analytics experience. Most recently,
he was cofounder and CTO of search engine startup
YapMap. Before that, he was director of new product engineering with
Quigo (contextual advertising, acquired by AOL in 2007). He also built
the Avenue A Razorfish analytics data-warehousing system and associated
services practice (acquired by Microsoft).

Paco Nathan 

Paco is the Director of Data Science at Concurrent in San
Francisco and a committer on the Cascading open-source
project. He has expertise in Hadoop, R, Amazon Web Services,
machine learning, predictive analytics, and more than
25 years in the tech industry overall. For more than 10 years, Paco has
led innovative data science teams, building large-scale apps. He is also
the author of “Enterprise Data Workflows with Cascading.” Previously a
Computer Science instructor at Stanford University, he is now teaching
professional workshops about data science, Big Data, machine learning,
and more.

Josh Patterson 

Josh is a Principal Solution Architect at Cloudera. Prior to
joining Cloudera, he was responsible for bringing Hadoop
into the smart grid during his involvement in the openPDC
project. His focus in the smart grid realm with Hadoop and
HBase was using machine learning to discover and index anomalies in
time-series data. Josh spent three years as a Principal Solutions Architect
with Cloudera helping Fortune 100 companies build out their
Hadoop and machine-learning pipelines.

Josh is a graduate of the University of Tennessee at Chattanooga
with a Bachelor's in Business Management and a Master's of Computer
Science with a thesis titled “TinyTermite: A Secure Routing Algorithm,”
where he worked in mesh networks and social insect swarm algorithms.
Josh has spent more than 15 years in software development,
and he continues to contribute to projects such as Apache Mahout,
Metronome, IterativeReduce, openPDC, and JMotif in the open-source
community.

Krishnan Raman 

Krishnan is a data scientist at Twitter. He was formerly a
risk quant at Bank of America, an associate at Goldman
Sachs, and an engineer at Sun Microsystems. His experience
in building the real-time proprietary trading system
WebET at Goldman Sachs, and concurrent Scala systems to compute
the conditional value at risk of large credit portfolios at BAC, has stood
him in good stead on the Revenue Quality team at Twitter. His primary
tools are Scala, Scalding and a dash of statistics and math. He has graduate
degrees in math, computer science and mathematical finance
from the University of Chicago.

Jairam Ranganathan 

Jairam is the Director of Product Strategy at Cloudera,
where he is responsible for planning the road map of
Cloudera products. Before Cloudera, he spent a decade at
VMware, where among other things he was one of the developers
on vMotion, storage vMotion, and the distributed management
framework for vSphere.

Dan Rosanova 

Dan is a four-time Microsoft Integration MVP with 14 years
of experience delivering solutions on Microsoft and Solaris
platforms in the financial services, insurance, banking,
telecommunications, and logistics industries. He has specialized
in high-volume and low-latency distributed applications. His
recent focus has been on Hadoop, evolutionary computation and GPU
computing. Dan speaks frequently on leading-edge technology and its
impact on the enterprise landscape. Dan is the author of “Microsoft
BizTalk Server 2010 Patterns.” Dan is a senior architect in the Technology
Integration practice at West Monroe Partners, an international, full-service
business and technology-consulting firm focused on guiding
organizations through projects that fundamentally transform their
businesses.
Eddie Satterly

Eddie is Chief Big Data Evangelist at Splunk, and has
served in a variety of roles, including developer, engineer,
architect and CTO over his 23-year career. He has been a
longtime Big Data user, even before it was the cool thing to
do. More recently, he revolutionized the way a leading online
travel agency delivers its core Web applications, which resulted in an
improved user experience. He created a highly scalable and flexible Big
Data environment using best-in-breed tools, and as a result, was able
to retire 35 other systems.

Eddie has given guest lectures at universities, and presents at several
conferences and symposiums yearly. He is a recognized expert in the
field of Big Data and has presented at many global conferences on the
topic. Eddie has a B.S. in Computer Science from Indiana University.


Justin Sheppard

Justin is an IT Director with Sears Holdings and Head of
Business Operations for MetaScale, a big data technology
subsidiary of Sears. Justin is leading Sears Holdings' efforts
to harness the power of Hadoop and other open-source
technologies to deliver business value from data. Prior to joining Sears,
Justin was with Deloitte Consulting for 12 years serving Fortune 500
clients. Justin received his MBA from the University of Chicago.

Dave Stokes 

David is a MySQL Community Manager for Oracle, and 
previously was the Certification Manager for MySQL AB. 
He has worked for companies ranging from the American 
Heart Association to Xerox. 



Michael Segel 

Michael is the President and CEO of Segel & Associates and
works with clients to assist with their strategy and implementation
of Hadoop. Michael has been working primarily
in the Big Data space since 2009 and founded the Chicago
Hadoop User Group (CHUG). Having been described as someone who
has a face for radio, Michael tries to stay behind the scenes but sometimes
gets prodded into speaking at meetups and has done
Hadoop/HBase training for customers.

Michael received his bachelor's degree in Computer Science from 
the College of Engineering at Ohio State University. 

Jonathan Seidman 

Jonathan is a Solutions Architect on the Partner Engineering
team at Cloudera. Before joining Cloudera, he was a
Lead Engineer on the Big Data team at Orbitz Worldwide,
helping to build out the Hadoop clusters supporting the
data-storage and analysis needs of one of the most heavily trafficked
sites on the Internet. Jonathan is also a cofounder and organizer of the
Chicago Hadoop User Group and the Chicago Big Data Meetup, and a
frequent speaker on Hadoop and Big Data at industry conferences such
as Hadoop World, Strata and OSCON.

Tony Shan 

Tony is a renowned thought leader and technology visionary
with decades of experience and guru-level knowledge
of emerging technologies for pragmatic enterprise computing.
He has directed and led the life-cycle design of
complex distributed systems on diverse platforms in Fortune 50 companies
and big public-sector organizations.

He drove innovations with insightful consulting and advising on large-scale,
high-profile projects that won many awards. He authored dozens of
top-notch publications and more than 10 books on next-generation technologies.
He contributed multiple entries on architecture and methodology to
IT encyclopedias. He is a regular keynote speaker and chair, moderator,
advisor, and organizing committee member in preeminent conferences;
an editor and editorial advisory board member of IT research journals
and books; and a founder of several user groups and forums. In particular,
he is a world-leading authority in the Big Data and cloud space, delivering
scores of presentations, panels and workshops in various industry events,
and serving as general chair in international conferences. He has extensive
speaking experience at conferences and industry events.




Kathleen Ting 

Kathleen is a Support Manager at Cloudera and a committer
on the Apache Sqoop project. She has spoken at many Big
Data conferences: at Hadoop World on Map/Reduce,
at HBaseCon on HBase, at Strange Loop on ZooKeeper,
and at Hadoop Summit on Sqoop.




Siva Vaidyanatha 

Siva is the Chief Technology Officer for the Retail, Consumer
Goods, Life Sciences and Logistics Business Unit at
Infosys. He is a member of the unit Executive Council and
is also responsible for the setup, organization and delivery
of technology consulting services aligned with the unit's strategic plans.

Siva has about 16 years of industry experience, and has spent the
last 10 years with Infosys in various technology leadership roles. He is
recognized as a technology visionary and has incubated several innovative
technology products and solutions. He has also authored two
books on next-generation architecture and Big Data.

Siva is on the Board of Directors of the Parkland Center for Clinical
Innovation (PCCI). PCCI's vision is to help transform the delivery of healthcare
by developing cutting-edge software and analytic methods to improve
the quality and safety of care at the individual and population levels.

Siva received his bachelor's degree in Engineering from the Indian 
Institute of Technology, Madras, and his master's degree in Business 
Administration from the SMU-Cox School of Business. 



Vinithra Varadharajan 

Vinithra is a Software Engineer at Cloudera. She builds 
tools for Hadoop life-cycle management, with a focus on 
automatic configuration of Hadoop clusters, and setting up 
High Availability and Disaster Recovery systems. 



Anton Yazovskiy 

Anton is a Software Engineer at Thumbtack Technology,
where he focuses on high-performance enterprise architecture.
He has presented at a variety of IT conferences and
“Dev Days” on topics such as NoSQL and MarkLogic.

Anton has been an active user of many NoSQL databases, including 
Cassandra, MongoDB, MarkLogic, Aerospike and HBase. Like many 
people, he learned some of the difficulties behind polyglot persistence 
the hard way, and is hoping his talk will help others avoid making some 
of the same mistakes he made. 
Big Data TechCon will be held at the Hyatt Regency, just outside of San Francisco.


Hyatt Regency Burlingame 

1333 Bayshore Highway 
Burlingame, CA, 94010 
Phone: +1-650-347-1234 
Fax: +1-650-696-2669 

www.sanfranciscoairport.hyatt.com 

Special Big Data TechCon 
Discounted Rates 

Take advantage of special discounted room rates at the
Hyatt Regency: only US$185 per night for single/double
occupancy.



Rooms for the reduced rate are limited! 

Click here to make your hotel reservation 
or use the “Make Hotel Reservation” link 
on the confirmation page of your registration. 

Reservations at the reduced rate can be made through 5:00 pm 
Eastern time on October 5, 2013 — assuming they don’t sell out. 
The number of rooms in the discounted block is LIMITED and 
historically rooms sell out well before the deadline. Don’t wait 
until the last minute to reserve your hotel rooms! 

This rate is available throughout Big Data TechCon. Those who 
reserve their hotel rooms via this reservation link will receive: 

• Complimentary wireless Internet service in their rooms. 

• Overnight self-parking discounted to $8 per day. 

The Hyatt does offer valet parking options at $25 per day. 

Hotel Highlights 

The Hyatt Regency Burlingame is a newly updated hotel located
on San Francisco Bay between the excitement of downtown
San Francisco and the technology corridor of Silicon Valley.



Parking 

The Hyatt parking garage makes for easy arrivals and fast departures
via self-parking. A full day of parking is $8. The Hyatt also
offers valet parking at $25 per day.

Complimentary Shuttle Service 

The shuttle is available every day, 24 hours a day, and runs every
10-15 minutes. Take your luggage to the Departures Level, center
island, and look for the area marked “Hotel Shuttle.” The shuttle is
a large bus marked “Hyatt Regency and Marriott.” For arrivals
between midnight and 4:46 am, shuttles pick up every 30 minutes.

Driving Directions 

From San Francisco International Airport (2 miles): 

Take 101 South toward San Jose. Exit Millbrae Ave. Turn left on
Millbrae Ave. Turn right at the second stoplight onto Bayshore
Hwy. Proceed through 4 stoplights. Hotel is on the right-hand side.

From Oakland Airport (approximately 30 miles) and Points East: 

Take I-880 South toward San Jose. Merge onto CA-92 W toward
San Mateo Br. Merge onto US-101 N toward San Francisco to the
Broadway Exit. Take the Airport Blvd ramp toward Bayshore Blvd,
then turn left onto Bayshore Hwy to the hotel.

From San Jose Airport (approximately 30 miles) and Points South:

Take 101 North to the Broadway Exit. Take the Airport Blvd ramp 
toward Bayshore Blvd, then turn left onto Bayshore Hwy to the 
hotel. 
Getting Approval


Try These 11 Time-Tested Tactics: 


1. STUDY. Note the HOW-TO classes and tutorials at
Big Data TechCon focused on the latest Big Data technologies,
especially those that are best aligned with your company’s existing
IT infrastructure. Say that this is your first, and most practical,
opportunity to bring Big Data to your business.

2. PREPARE. Download the course catalog and circle the classes
you want to take, and explain why the topics relate to your Big Data
technical efforts. Show that you have found many sessions that
fit your specific needs, and your company’s strategic goals.


“Big Data TechCon has a great atmosphere, organization and
extra activities, even a morning run. It’s a great time with many
experts to learn from.”

—Jarek Jarcec Cecho, Software Engineer, Cloudera 


3. JUSTIFY. Go in armed with all the necessary
materials to make a good case for how
your attending Big Data TechCon will help
your company make money, save money or
improve productivity by helping you capture
and analyze the data that drives your business.





4. SHARE. Promise to come back from Big Data
TechCon and hold a brown-bag lunch session to share what
you've learned with your colleagues, or even conduct formal
training within your department. In fact, maybe you’ll want to
schedule a series of brown-bag lunches.

5. PLAN. Tell management that after you attend
Big Data TechCon, you'll make definite action plans and
recommendations to implement new Big Data plans, and to
improve how your company uses all of the data it captures.


“Big Data TechCon is a great way to raise your awareness on 
what’s out there for Big Data and gives you ideas on what to 
dig into.” 

—Corey Andalora, Sr. Java Developer, Dealer.com 


6. RELATE. Show how problems or issues you’ve recently
encountered fit with the classes at Big Data TechCon, and
describe the types of technology discussions you'll have with
the conference faculty and other IT professionals.

7. SAVE. The tuition and travel expenses of attending Big Data
TechCon are lower than those of many other conferences. The earlier you sign
up, the more you save, so explain the benefit of signing up early,
and of booking your hotel room before the cutoff.

8. TEAM. Save even more with group discounts. Send three or
more employees from your company, and save $100 per
person. Each person can take different classes and bring back
even more valuable tips and techniques. (Sending
10 or more? Contact us for arrangements.)

9. GROUP. User groups, government employees,
non-profits and professionals employed by or attending educational
institutions can also receive special savings. Check the
website or ask Stacy Burris (sburris@bzmedia.com) about custom
options for your group.



10. LAUNCH. Classes at Big Data TechCon help you get a
jump-start on every aspect of Big Data that you have been talking
about implementing (but haven’t) for months. Whether it’s
Hadoop, graph databases, NoSQL or another new technology,
explain that you’ll find the answers here.

11. DECIDE. While you can sign up anytime, your company
will save the most if you beat the deadlines. Explain that you
will help your company’s bottom line by signing up for
Big Data TechCon today!
                          Register by   Register by   Register by   After
                          Aug. 2        Aug. 30       Sept. 27      Sept. 27

Three-Day Conference      $1,195        $1,295        $1,395        $1,595
October 15-17             SAVE $400     SAVE $300     SAVE $200

Exhibit Hall Only         FREE          FREE          FREE          FREE
October 16-17





Register Online TODAY at www.BigDataTechCon.com! 

Three-Day Conference
Registration Includes:

• Admission to tutorials and
technical classes on October 15, 16 and 17

• Admission to keynotes

• Admission to the Exhibit Hall

• Admission to all special events,
including the Networking Reception

• Downloadable conference materials

• Coffee breaks and lunch where indicated

Exhibit Hall Only
Registration Includes:

• Admission to the Exhibit Hall

• Admission to Networking Reception


How to Register

Register online and use one of the
following payment methods:

Credit Card. You can use the secure
online form to pay via credit card and
get immediate confirmation of your registration.
MasterCard, Visa and American
Express are accepted. You'll receive a
registration record and receipt. Please
print out these pages and bring them with
you to the Conference. Present them at the
Registration Desk to pick up your badge
and course materials.

Check. Fill out the online registration
form. Print out the registration record and
receipt and mail them to BZ Media LLC, 225
Broadhollow Road, Suite 211, Melville, NY
11747, with your payment. Online registrations
that are mailed without payment will
not be confirmed until payment is received.

Purchase Order. If you register using
a P.O., you'll be invoiced immediately for
the registration amount. Payment must
be received before your registration can be
confirmed.


Cancellation and Refund Policy

You can receive a full refund, less
a $150 registration fee, for cancellations
made by Friday, Aug. 30, 2013. Cancellations
after this date are non-refundable.
Send your cancellation in writing to
registration@bzmedia.com. Registrations
may be transferred to another person.

Refunds will be processed through the
same method of payment as the initial
payment transaction. Credit-card refunds
will be processed to the same credit card
as the original payment.

If for reasons beyond our control
the conference cannot take place as
scheduled, BZ Media reserves the right to
reschedule the conference to a date and
place of its choosing.

Questions 

Contact Stacy Burris, Event Director, at
sburris@bzmedia.com or
+1-631-421-4158 x108.


Special Discounts 

You may combine one of these special discounts with the Early Registration pricing to save even more! 


Group. Group discounts will be given automatically if you 
register three or more people at once. You can also contact 
Stacy Burris at sburris@bzmedia.com to receive the $100/person 
discount if your group is unable to register at the same time. 
Contact her also for special discounts for groups of 10 or more. 

Government Employees. Federal, State and Local Government
employees can receive an additional $100 off the
Three-Day Conference price. Enter code GOV in the
discount code field. CCR-registered indicates that we are listed in
the primary supplier database for the Federal Government.


Educational Institutions. Personnel employed by or attending 
educational institutions can get a $100 discount off the 
Three-Day Conference price by using the code EDU. 

User Groups. Contact Stacy Burris at sburris@bzmedia.com to 
see if your group is eligible for a discount. 

Non-Profit Organizations. Personnel employed by non-profit 
organizations can get a $100 discount off the Three-Day 
Conference price by using the code NONPROFIT. 